Hi all,
Please welcome Markus Koschany (Cc:), who has kindly agreed to work on
improving our Linux packaging. As a Debian developer, he will
specifically work on packaging OpenRefine for inclusion in the official
Debian repository, which should eventually make the tool available in
many Debian derivatives. Markus will be carrying out this work as a
Freexian contractor
(https://www.freexian.com/en/services/debian-packaging.html), funded by
our CZI-EOSS grant.
Beyond the benefit of greatly simplifying the installation process for
users, the hope is that this project will also help us adopt better
practices as maintainers, for instance migrating away from non-free or
obsolete dependencies (such as the org.json migration done a few years
ago) or fixing security vulnerabilities more swiftly.
Indeed, Debian has pretty strict requirements concerning the packaging
of dependencies, the handling of non-free code and other packaging
topics. This should hopefully be a useful follow-up to our previous call
for projects on tackling technical debt.
One major question is which branch he should start working from. We
could aim to package the master branch (so, the 3.x series) or directly
the new-architecture branch (which should become the 4.x series).
I am personally leaning towards 4.x for the following reasons:
- Debian's release cycles are not so quick, so if we work on 3.x the
risk is that by the time it reaches users in Debian stable, it will
already be outdated and superseded by stable 4.x releases.
- the current version of Jetty used in 3.x (6.1.26) is very old and not
available in Debian anymore. In the new architecture I have had to
migrate to Jetty 9 already (to avoid dependency conflicts with Spark),
so that is one less obstacle. We could try to backport this migration
to 3.x but it could potentially create conflicts with extensions (and
generally speaking it is not clear to me whether it is safe to do so
now, as we are thinking of cutting a 3.5 release).
- more generally, since the new architecture has not been released yet,
we can afford to make pretty arbitrary breaking changes there to comply
with Debian's guidelines. For instance, if we discover a non-free
dependency somewhere (similar to org.json), we can afford to get rid of
it quickly.
- the new architecture uses Maven modules more, which should make it
easier to maintain extensions outside our code base. This is relevant
for packaging, since we should make it possible for extensions to be
packaged independently too.
That being said, since the new architecture is still in its infancy, I
can totally understand if people prefer to ship the 3.x series instead.
Let me know what you think!
Antonin