zotero server code release

Sean Takats

Jun 21, 2010, 11:08:33 AM

to zoter...@googlegroups.com

The Zotero data server code will at long last be made publicly available at the end of this week, and to prepare for its release we would like to encourage some discussion regarding its future development. Bruce suggested the following two questions to get us started, but surely there are many more that will arise:

1. What do we want to achieve?

2. What arrangements (social, technical, etc.) best allow us to achieve these goals?

I'll take this opportunity to say just a few words in response to each question. First, in terms of desired outcomes, I would like to see the development of the server code accelerate in known directions (e.g. continue to build out the API) as well as to see the unexpected ingenuity that this injection of fresh talent will undoubtedly produce. Frank Bennett's amazing work to build a new citation processing engine, now folded into the Zotero client trunk, is a great example of what we might achieve.

Second, in terms of a social and technical infrastructure designed to encourage such development, I'm inclined to offer just a minimum of suggestions until we all have the chance to hear what the members of the Zotero development community think. For starters, we might ask whether this mailing list can expand its scope to encompass discussions of the server code. Given its relatively low traffic, I would say probably yes. Even if that's the case, however, the Zotero development team directed by the Center for History and New Media won't be able to provide technical support. Nonetheless, we're putting together some extremely basic documentation that should be enough to get the most technically-savvy among you started, and this documentation can of course be fleshed out by the community as time goes on.

Looking forward to getting your feedback.

Best regards,
Sean

alex

Jun 21, 2010, 3:17:57 PM

to zotero-dev

Hi everyone, this is alexuw from the zotero forums. I'm very
interested in working with the server, but I think it should be a
project distinct from zotero. I've already started working on it, and
have talked to a few people in the zotero community about it.

At this point there are many possible directions. I'd like to start
with the basics: building a server that is well-documented, easily
installed, and interoperates with the zotero firefox plugin, but
independent of the zotero.org infrastructure. Some users require that
independence for reasons of confidentiality, legal restrictions,
privacy concerns (not zotero but the patriot act).

I think this needs to be a project separate from zotero. For one
thing, the official project has shown no interest in such efforts,
even seems resistant. For another, there are divergent goals here.
Zotero's longer-term direction is toward building a community at
zotero.org, and gaining value from the central data store. People
running their own servers is a different direction entirely. These
are incompatible missions at some level, but that doesn't mean they
can't both thrive, and even produce some positive feedback in both
directions. Finally, I'll be honest, I'm not comfortable with
zotero's development process - IMO too opaque and tightly-controlled.

A separate, more specialized, project could actually strengthen the
core project. Certainly I want to contribute code back to the main
line, and documentation as well. In the longer term I'd like to
explore possibilities for continuous integration. Besides that, being
server-centric, the new project will have a different culture and, I
think, serve as a counterpart to the existing project centered on the
GUI and zotero.org.

That's about it, I think a separate repository and separate forum for
discussion is appropriate. In relation to zotero.org, cooperative and
collaborative from day one, but distinct. I've decided to take the
initiative myself, and am setting it up now. Whether it succeeds
depends on others contributing and collaborating, of course. But
whatever happens with that I'm keenly interested in having a lively
discussion - here, at zotero.org, in the new project. Whether you
agree with me or not. It's exciting to me that zotero might become
something larger, less self-contained, more part of a software
ecosystem.

Alex

Bryan Bishop

Jun 21, 2010, 4:07:49 PM

to zoter...@googlegroups.com, Bryan Bishop

On Mon, Jun 21, 2010 at 10:08 AM, Sean Takats wrote:
> as well as to see the unexpected ingenuity that this injection of
> fresh talent will undoubtedly produce.

One of the "unexpected" areas that I'd like to contribute to is an
alternative framework for translators. At the moment it's just a
little toy in my mind. The idea is to host the translators on the
server-side, then send HTML pages over to the server to get parsed,
and return some JSON or YAML. The reason why this would be worthwhile
is that it's a single point of failure that would get updated
immediately if the translator/parser gets out of date. Translator
contributors would write code in some other language- not javascript-
so that the translators don't have to run on a browser in a VM behind
the server (blah). If anywhere there was opportunity to get this
rolled out, it's zotero.org :-).

- Bryan
http://heybryan.org/
1 512 203 0507

Bruce D'Arcus

Jun 21, 2010, 6:08:06 PM

to zoter...@googlegroups.com

On Mon, Jun 21, 2010 at 3:17 PM, alex <alexuw.z...@gmail.com> wrote:

...

> I think this needs to be a project separate from zotero. For one
> thing, the official project has shown no interest in such efforts,
> even seems resistant. For another, there are divergent goals here.
> Zotero's longer-term direction is toward building a community at
> zotero.org, and gaining value from the central data store. People
> running their own servers is a different direction entirely. These
> are incompatible missions at some level, but that doesn't mean they
> can't both thrive, and even produce some positive feedback in both
> directions. Finally, I'll be honest, I'm not comfortable with
> zotero's development process - IMO too opaque and tightly-controlled.

...

> That's about it, I think a separate repository and separate forum for
> discussion is appropriate. In relation to zotero.org, cooperative and
> collaborative from day one, but distinct.

So I suppose I could ask this question one way ("what changes do you
think would have to be made with zotero proper to avoid the need for
this?") but will instead ask: what concretely are you thinking that
will enable a separate project to be "cooperative and collaborative
from day one"? What sorts of metrics could we look to measure these
sorts of things?

Bruce

Bruce D'Arcus

Jun 21, 2010, 6:13:32 PM

to zoter...@googlegroups.com

Given my repeated ranting about translators, I'm happy to see this
sort of initiative. But I would suggest you start by articulating
concrete use cases before settling on particular technical solutions.
E.g. I want whatever solution to enable:

1) easier translator authoring

2) mobile and other non-Firefox client support

3) portable translators (right now, I'm interested in the Sakai 3
library integration project being able to reuse the work)

4) ____ ?

BTW, beyond Zotero, the other projects that use similar solutions
include CiteULike, Connotea, Mendeley.

Bruce

Kieren Diment

Jun 21, 2010, 8:28:29 PM

to zoter...@googlegroups.com

I found working with zotero as a programmer rather frustrating until I started writing example driven documentation about the API, and found. As a user I'm very happy, although I'm happier now that I can add my own visualisations and queries independently of the Zotero gui - and I've been doing some quite sophisticated stuff with tags and a quasi-systematic literature review recently.

I'd also probably echo some of Alex's concerns about the development process (I'm used to a very anarchistic model in the other open source code I'm involved with). However, I suspect that this is partially to do with the fact that zotero is primarily application code, and the anarchistic model seems less appropriate for this kind of code base (e.g. linux, open office), more suitable for the development of tools. However it's still a laudable aim to attempt to keep the barrier to entry for development in the zotero ecosystem as low as possible. How to achieve this?

Well the community of developers around CSL shows that the anarchistic model certainly works well for a project with well defined aims where unit tests are easy to write. I haven't contributed because I haven't had the need, and besides, XML brings me out in a rash. Mind you, I do watch with interest.

However, that's a side issue. What do we need to get the barrier to entry for Zotero as low as possible? The work I've done with zotero-browser is a start - it makes getting started with programming the zotero API as easy as PHP development. I haven't looked at baking zotero+POW+xulrunner together to provide a headless zotero repository, but that's got a lot of potential as an approach. However the POW firefox extension seems insufficiently robust, and the maintainer appears to lack the resources for getting a version that works on all platforms working with FF 3.6. This super-easy-development environment (which would be useful for development of bookmarklets, and prototyping the things to get zotero working outside of firefox) would be a valuable addition to zotero - but I suspect that resources would need to be put in to make it robust, and possibly better integrated with zotero (e.g. addition of convenience functions). This would be a great Google summer of code project.

What's this got to do with the server? Directly, not much. Indirectly, we need to describe the core tools for the zotero ecosystem from the perspective of a developer/administrator. As far as I can see, this comprises: The application, the headless xulrunner zotero, my easy dev environment, and the sync server.

Providing these tools so that they scale from the smallest scope (lone developer) to the largest deployment (e.g. zotero.org) without too much effort on behalf of the end user is in my experience probably the best way to get an environment where the social/technical etc. issues become obvious, the process becomes low friction, and the barrier to entry is as low as possible. This low barrier to entry is of critical importance, as the user base for zotero is not your typical programmer, but more likely a researcher who happens to do some programming (like me).

The other part of the process is while it's clear that the zotero application needs to be in some sense centrally controlled, the anarchistic model of open source contribution is a much better way to develop a vibrant ecosystem, so having tools in place that make that possible will massively enhance the potential of zotero's tools to have a major influence for the better in aspects of scholarly communication.

A niggle I talked about before, which is more important from a programmer perspective than a user perspective, is to have the documentation available for use and modification offline. I half-implemented a solution based on the wiki and git a while ago, but I stopped caring for a while. However it is another source of development process friction.

Penultimately, most significant open source projects have at least one IRC channel. I'd recommend looking into establishing a #zotero-dev and #zotero channel, probably on freenode.

Finally, I think it would be a very good idea to start thinking about how to recruit Google summer of code students to develop components of the zotero developer ecosystem, as well as considering any other possible sources of funding.

I will get around to looking at the sync server at some point, to look at the feasibility of making it database server agnostic (so a lone dev can use it with sqlite for testing for example). This may or may not be a feasible goal.

Just a note on my credentials here. I've been involved in the Catalyst web framework[1] project for a number of years, where I am one of the people seen as in charge of the documentation. I wrote a book on it for the publisher Apress (Springer's tech book arm), and based on that I was the guest speaker at an international (to me) Perl conference where I talked about open source community issues, and documentation (separately). Nonetheless I'm a distinctly amateur programmer who dabbles from time to time. For my real job, at the moment I'm a PhD student doing work on the sociology of IT implementations in aged care homes, and I was unit coordinator for a subject at my institution teaching "Organisational issues in IT", although I've been doing research on organisations for a number of years, mainly with a focus on virtual organisations, and social/IT issues at work.

[1] Catalyst is a good example of a set of tools that as well as being optimised for developer convenience, scales to the very largest scope - e.g. as well as the single user research data management/analysis tools I write with catalyst, it also runs the BBC iPlayer and some other very large websites. It's rather more optimised for flexibility than the other major web frameworks (e.g. Rails, Django).


alex

Jun 21, 2010, 7:41:08 PM

to zotero-dev

In response to Bruce's question, the main point is that a server
project as I described would help out some users who are currently out
of luck. Besides these users having different requirements I expect
the community would be more technically-inclined (they're interested
in running servers after all). So I see a place for a different
product, and a different development approach and culture to go with
it. (Not to belabor the obvious but there are all sorts of cases where
software products are produced from common or overlapping sources.)
Anyhow, even if we contributed nothing back to the main line, the
people participating would have to discuss and document the server
design and APIs. That'd be a positive thing, and a chance to work
together.

To be honest though, the bigger part of it might just be the attitude of
the people involved. I'm not just mouthing platitudes, I really do
want to be cooperative and collaborative. But I also like a more open
process. For example I seriously can't understand why the server
source was not public in the beta period, or why the project hasn't
publicized more detail about the platform it's running on. The people
who need or want to run their own servers had the same reaction, it
came up repeatedly in the user forums.

About metrics, actually I'm not quite sure what you mean by that. I'm
happy to have a discussion like this, here. But for the task as I've
defined it, I think we need a different approach. For example we need
a serious discussion of version management and build strategies - I
suspect that may be trickier than whatever coding is needed to get the
server up and running. In my view this is best accomplished in a new
forum and a new (yet to be defined) process and social dynamic. Of
course there has to be an open channel with zotero, I hope a wide-open
channel.

Anyhow I believe in open-source software - I believe it's not a zero-sum
game. Because of that I can see two distinct communities and products
here, and see it not as competition or fragmentation but opportunity
and growth.

Besides, what's wrong with a little competition? (Strange for me to
say, since my politics are borderline socialist!) :-D

skornblith

Jun 22, 2010, 5:05:47 AM

to zotero-dev

On Jun 21, 5:28 pm, Kieren Diment <dim...@gmail.com> wrote:
> I found working with zotero as a programmer rather frustrating until I started writing example driven documentation about the API, and found. As a user I'm very happy, although I'm happier now that I can add my own visualisations and queries independently of the Zotero gui - and I've been doing some quite sophisticated stuff with tags and a quasi-systematic literature review recently.
>
> I'd also probably echo some of Alex's concerns about the development process (I'm used to a very anarchistic model in the other open source code I'm involved with). However, I suspect that this is partially to do with the fact that zotero is primarily application code, and the anarchistic model seems less appropriate for this kind of code base (e.g. linux, open office), more suitable for the development of tools. However it's still a laudable aim to attempt to keep the barrier to entry for development in the zotero ecosystem as low as possible. How to achieve this?
>
> Well the community of developers around CSL shows that the anarchistic model certainly works well for a project with well defined aims where unit tests are easy to write. I haven't contributed because I haven't had the need, and besides, XML brings me out in a rash. Mind you, I do watch with interest.
>
> However, that's a side issue. What do we need to get the barrier to entry for Zotero as low as possible? The work I've done with zotero-browser is a start - it makes getting started with programming the zotero API as easy as PHP development. I haven't looked at baking zotero+POW+xulrunner together to provide a headless zotero repository, but that's got a lot of potential as an approach. However the POW firefox extension seems insufficiently robust, and the maintainer appears to lack the resources for getting a version that works on all platforms working with FF 3.6. This super-easy-development environment (which would be useful for development of bookmarklets, and prototyping the things to get zotero working outside of firefox) would be a valuable addition to zotero - but I suspect that resources would need to be put in to make it robust, and possibly better integrated with zotero (e.g. addition of convenience functions). This would be a great Google summer of code project.

We have plans to add an integrated web server in Zotero 2.1. At the
moment, I've been playing with the skeleton of the old webserver from
integration.js from Zotero 1.0, a reduced version of which now exists
in integration_compat.js for the sole purpose of telling users to
update their Word plug-ins. With a little bit of work, it could
probably be extended to do something like POW. If there's demand for a
more sophisticated web server, we could potentially use something like
Mozilla's httpd.js (see http://mxr.mozilla.org/mozilla-central/source/netwerk/test/httpserver/),
although to date I've shied away from this because the codebase is >20x
the size of our mini-server, and the ability to execute multiple
requests at once is not a priority for us at this time.

We already have Zotero running standalone in xulrunner, as part of the
standalone project, which is in the repository. While at the moment
we've only created a build script for OS X, in principle, there's no
reason it shouldn't be possible to create builds for every platform
xulrunner supports, and we plan on doing this in the future.

> What's this got to do with the server? Directly, not much. Indirectly, we need to describe the core tools for the zotero ecosystem from the perspective of a developer/administrator. As far as I can see, this comprises: The application, the headless xulrunner zotero, my easy dev environment, and the sync server.
>
> Providing these tools so that they scale from the smallest scope (lone developer) to the largest deployment (e.g. zotero.org) without too much effort on behalf of the end user is in my experience probably the best way to get an environment where the social/technical etc. issues become obvious, the process becomes low friction, and the barrier to entry is as low as possible. This low barrier to entry is of critical importance, as the user base for zotero is not your typical programmer, but more likely a researcher who happens to do some programming (like me).
>
> The other part of the process is while it's clear that the zotero application needs to be in some sense centrally controlled, the anarchistic model of open source contribution is a much better way to develop a vibrant ecosystem, so having tools in place that make that possible will massively enhance the potential of zotero's tools to have a major influence for the better in aspects of scholarly communication.
>
> A niggle I talked about before, which is more important from a programmer perspective than a user perspective, is to have the documentation available for use and modification offline. I half-implemented a solution based on the wiki and git a while ago, but I stopped caring for a while. However it is another source of development process friction.

I agree that this is a good idea, although it's not my department. For
us Zotero core developers, ever since XULPlanet disappeared, trying to
track down relevant Mozilla documentation is a constant annoyance, and
I wouldn't be surprised if others feel the same way about our current
documentation infrastructure.

Simon

skornblith

Jun 22, 2010, 5:27:22 AM

to zotero-dev

> - Bryan
> http://heybryan.org/
> 1 512 203 0507

We've been thinking a little about this, but we think JavaScript is a
pretty good language for scraping data from webpages, and we don't
think it's feasible to rewrite all of our 250+ translators. One
possibility, which I have yet to explore in detail, is to use Aptana Jaxer
(http://jaxer.org/), a server-side JavaScript application server that
seems sufficiently fast and full-featured that we may be able to use
our existing translators without major modifications. There may also
be other alternative toolchains that could provide this kind of
functionality.

Simon

Bruce D'Arcus

Jun 22, 2010, 2:15:01 PM

to zoter...@googlegroups.com

On Mon, Jun 21, 2010 at 7:41 PM, alex <alexuw.z...@gmail.com> wrote:

...

> About metrics, actually I'm not quite sure what you mean by that. I'm
> happy to have a discussion like this, here.

I'm asking a question larger than the particular projects: what
specific things do we need to see to declare what we do "more open"?
So "metrics" might be number of people with particular kinds of
responsibility, commit rights, etc. It might be particular policy
changes.

Bruce

alex

Jun 22, 2010, 2:55:05 PM

to zotero-dev

Ah, I see where you're going Bruce. To over-simplify, seems like
there's a continuum of process from tight control to what Kieren aptly
terms the "anarchic" model. To de-simplify a little, there isn't one
continuum but multiple: continua of design method, development
process, management structures. And to further complicate things, even
continua of development tools and especially how they're employed
(e.g. how broadly commit privileges are granted). Personally I prefer
some fluidity here. Consider a project like zotero - must be
thousands more users than a couple years ago when I signed on.
Implementing the sync feature radically changed what it's about. With
changes like that usually process needs to evolve as well. Multiple,
interrelated projects, as has already happened with CSL is a natural
development as well.

OK, I admit it, I'm theorizing "metrics" not defining them. :-/
Partly because I think these things have to be hashed out by doing -
and discussing as well, but you can't just define them in the
abstract. My attitude is very situational: depending on who else is
involved, the nature of the problem, and other conditions, you choose
one point on the continuum vs. another.

Kieren Diment

Jun 22, 2010, 4:39:33 PM

to zoter...@googlegroups.com

On 23/06/2010, at 4:55 AM, alex wrote:

> Ah, I see where you're going Bruce. To over-simplify, seems like
> there's a continuum of process from tight control to what Kieren aptly
> terms the "anarchic" model.

It's becoming clear to me that the GUI is not a particularly good place to apply the anarchistic model. XUL development is quite tricky, and somewhat fragile across firefox upgrades, so the skills required are quite specialised and high. The internationalisation doesn't really help either (I tried to patch a "list open url resolvers" link into the preferences screen a few days ago and got very lost due to the i18n stuff making it very difficult to grep the source tree). Openoffice.org also has problems with using the anarchistic model. Firefox doesn't because of the relative ease with which extensions can be developed, and the distribution mechanism (although harder than the typical skill set of a potential zotero developer I would suspect).

So my key point was that there need to be tools available to provide an easy dev environment by default - call it the zotero developer ecosystem or whatever. That way there's a way to encourage the development of the clearly superior anarchistic model while maintaining the independence of the GUI.

As an aside, I was doing a lecture on the management of open source projects the other week, and I was talking about the different contribution models. Normally the class was pretty quiet, but a student asked if the evolution of low friction development processes in open source was caused by the development of the git version control software. To which the answer was that no, git was a symptom, not a cause, but that it seems to have a strong synergistic effect in lowering the friction of the development process even further.

> To de-simplify a little, there isn't one
> continuum but multiple: continua of design method, development
> process, management structures. And to further complicate things, even
> continua of development tools and especially how they're employed
> (e.g. how broadly commit privileges are granted). Personally I prefer
> some fluidity here. Consider a project like zotero - must be
> thousands more users than a couple years ago when I signed on.

I think I saw that Zotero has around 4 million users in another thread, which is pretty substantial.

> Implementing the sync feature radically changed what it's about. With
> changes like that usually process needs to evolve as well. Multiple,
> interrelated projects, as has already happened with CSL is a natural
> development as well.
>
> OK, I admit it, I'm theorizing "metrics" not defining them. :-/
> Partly because I think these things have to be hashed out by doing -
> and discussing as well, but you can't just define them in the
> abstract. My attitude is very situational: depending on who else is
> involved, the nature of the problem, and other conditions, you choose
> one point on the continuum vs. another.
>

Kieren Diment

Jun 22, 2010, 4:48:51 PM

to zoter...@googlegroups.com

On 22/06/2010, at 7:05 PM, skornblith wrote:

> If there's demand for a
> more sophisticated web server, we could potentially use something like
> Mozilla's httpd.js (see http://mxr.mozilla.org/mozilla-central/source/netwerk/test/httpserver/),
> although to date I've shied away from this because the codebase is >20x
> the size of our mini-server, and the ability to execute multiple
> requests at once is not a priority for us at this time.

It's not really demand. I've demonstrated the strong potential, but POW seems too fragile and unmaintained. A good quality web server as a separate plugin with some zotero convenience functions would be useful. I wouldn't baulk at httpd.js - it's not heavy like apache for example ;). In fact it looks about the same size and scale as the production-grade-as-a-reverse-proxy-or-low-user-web-server that I use as a perl development environment.

Richard Karnesky

Jun 23, 2010, 3:10:47 PM

to zotero-dev

While I agree that some people interested in the server code have
different priorities, I do not agree that these are incompatible. Why
fork from an existing infrastructure that encourages tight-binding to
the Zotero client and has strong (internal) development?

Unless the Zotero team decides they will refuse useful code
submissions or concrete development plans, I don't think a fork will
be useful. This is not like the citeproc code that Frank developed.
With citeproc-js, code could be tested and can be used outside of
zotero, was written from scratch, and with some plan for merging
later. But the server is more dependent on the zotero client in the
immediate future & I see no plans for you to start from scratch.

> Finally, I'll be honest, I'm not comfortable with
> zotero's development process - IMO too opaque and tightly-controlled.

> ....

> I've decided to take the initiative myself, and am setting it up now.

A single part-time developer fork of part of the project does not
address the issues you see with the zotero development process. I
echo Bruce's question: Do you have concrete plans for your immediate
contributions to development & what (if anything) will be submitted
back to the CHNM project (and how)? What does zitation.org offer that
zotero.org does not?

You mention better openness and better collaboration are goals, but
why are those impossible to achieve with zotero.org & what specific
points can you claim that your new project will definitely be better?
If you feel that the development process, itself, is less-than-ideal,
why are you focusing on the server & not also the client?

I'm personally inclined to use the zotero.org infrastructure until
zitation.org demonstrates any advantages.

--Rick

alex

Jun 23, 2010, 9:04:25 PM

to zotero-dev

Richard, people have expressed a desire to run their own servers.
It's a show-stopper for some. The zotero organization has said they
won't offer support for the server. The purpose of a new project is
to support those users.

I don't know why you're implying it's a solo thing, I've stated very
clearly I think it will only work with the help of others. Time will
tell. And I've talked about collaboration and contributing back to
zotero because I really believe the two efforts are complementary.
Your comments sound as if I'm obligated to quantify the benefits to
zotero, precisely how and when. I'm not.

alex

Jun 23, 2010, 2:14:55 PM

to zotero-dev

As I said, I'm starting another forum for those who want to run the
server themselves. I've put up a site at http://www.zitation.org/. My
hope is that this and zotero will both thrive, in complementary ways.

Sean Takats

Jun 23, 2010, 11:07:37 PM

to zotero-dev

The position of CHNM with respect to the development and deployment of
the Zotero server is being somewhat distorted. It is true that CHNM
does not currently plan to offer technical support for custom server
deployment any more than it offers technical support for custom Zotero
client deployment. Our grant funding and our research interests
dictate this focus. However, that does not mean that community
development of the server code is somehow antithetical to the mission
of Zotero. This isn't something that we don't want to see happen. It's
just not something that we're going to devote resources to right now.
To my mind, it does not logically follow that server development ought
to move elsewhere, but hey, we made it free software, so nothing's
stopping anyone. And by "we" I don't mean CHNM. I mean the hundreds of
developers and thousands of forum contributors and countless users
that have worked in and around zotero.org to make Zotero what it is
today. -Sean
