pk <phil...@kime.org.uk> wrote:
> and all the seemingly extraneous stuff like accessing remote
> data is a few lines of code and not bloated at all.
It is perhaps not bloated within the biber source code, but it is
the main cause of dependencies (libwww-perl) and of security issues.
> External XML/XSLT libraries are in there, yes. This is much
> better than relying on the user having them (and the right version).
This is true on incomplete systems (like Windows) without proper
package management, but not on e.g. Linux distributions, which can
add the correct dependencies themselves.
(If a package manager cannot handle dependencies with correct
version ranges, it is broken.)
> I can say for certain than before I packaged it as a "binary"
> (more on this below), the user base was about 1% of what it is now.
No doubt about that: the majority of users are certainly
Windows users who would not even consider installing
Perl and its dependencies.
> It's strange to say that "dependency hell" isn't really solved
> by pre-compilation. It is. It really does solve it.
Only by replacing it with other problems.
For instance, under Unix, the unpacker uses predictable names
in world-writable directories, which is a clear "never do this"
from a security viewpoint (symlink attacks, race conditions, etc.).
Such security problems cannot be avoided if you want packaging
without using the system's own means for it.
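To make the attack surface concrete, here is a minimal sketch (in Python for brevity, though the software under discussion is Perl; the `"par-"` prefix and all names are illustrative assumptions, not biber's actual code). A predictable path under a world-writable directory like /tmp can be pre-created or symlinked by another local user; an unpredictable, mode-0700 directory from `mkdtemp()` cannot:

```python
import os
import tempfile

# UNSAFE pattern (illustrative only): a predictable name under the
# world-writable temp directory. Another local user can create this
# path first, or plant a symlink there, before our program uses it.
unsafe_path = os.path.join(tempfile.gettempdir(), "par-" + str(os.getuid()))

# Safer pattern: mkdtemp() atomically creates a fresh directory with
# an unpredictable name and mode 0700, so no other user can have
# pre-created it or can enter it.
safe_dir = tempfile.mkdtemp(prefix="unpacker-")
print(safe_dir)
print(oct(os.stat(safe_dir).st_mode & 0o777))  # mode 0o700

os.rmdir(safe_dir)  # clean up
```

The point is not the specific API but the principle: the safe variant delegates name generation and permission setting to the operating system instead of hard-coding a guessable location.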
As already mentioned, there are many more problems with pre-packaging
(e.g. security fixes in pre-packaged libraries would take far too long
to reach the users of a Linux distribution, since you are involved as
an intermediate packager).
Pre-packaging is an ad-hoc solution, probably necessary for the vast
majority of Windows users, but a nightmare for those who want a sane,
safe system and not a "but it is sufficient to work" hack.
A reasonable Linux distribution will only include the Perl sources.
(Thus, e.g., users of Gentoo-based distributions currently cannot use
biber, because perl-5.14 is not yet marked stable there. However, if
you do not rely on higher versions of Perl in future releases, it is
of course only a matter of time until this problem solves itself.)
> There aren't that many "gimmick" modules.
IPC-Run3, IPC-Cmd, List-AllUtils, Readonly, Readonly-XS,
Data-Dump, Data-Compare, Date-Simple, ...
with lots of implicit dependencies. E.g. IPC-Cmd pulls in
(directly or indirectly):
Module-Load, Module-CoreList, Module-Load-Conditional,
Params-Check, Locale-Maketext-Simple
All in all, I had to install about 60 Perl modules which
were previously not needed by any other Perl program (and I have
several of those). Sure, most of them are small, but their sheer
number is enormous. The main bulk is of course things like
libwww.
> We have to have Text::BibTeX as this parses .bib files.
Sure.
> We need XML libraries and modules for certain as we need a modern,
> standard data interchange format between biblatex and biber.
I am probably not the only one who believes that XML is the wrong
choice for an intercommunication format between TeX and an external
program: TeX is not natively related to XML. Moreover, biblatex and
biber are tied together by their nature anyway, so a "standard" format
is not important: why use a bloated format like XML when Perl can
simply parse any format (and a more straightforward solution would
actually be more convenient and powerful)?
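To illustrate the claim that a plain format needs only a few lines of parsing code (the format and field names below are hypothetical, invented for this sketch, not anything biblatex or biber actually emits; Python stands in for Perl here), consider a trivial "key = value" record format:

```python
# A hypothetical, minimal line-based interchange format: one
# "key = value" pair per line, records separated by blank lines.
# Parsing it needs no XML toolchain at all.
SAMPLE = """\
entrykey = sigfridsson
author = Sigfridsson, Emma
year = 1998

entrykey = nussbaum
author = Nussbaum, Martha
year = 1978
"""

def parse_records(text):
    """Parse the sample format into a list of dicts."""
    records, current = [], {}
    for line in text.splitlines():
        line = line.strip()
        if not line:            # blank line terminates a record
            if current:
                records.append(current)
                current = {}
            continue
        key, _, value = line.partition("=")
        current[key.strip()] = value.strip()
    if current:                 # flush the last record
        records.append(current)
    return records

print(parse_records(SAMPLE))   # two records as dicts
```

An equivalent Perl loop would be about as short; the point is only that a format tailored to the two programs costs a dozen lines, not an XML/XSLT dependency chain.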
However, this is a different discussion, which I do not want to
go into here.
> It really is more complex than just reading some files and outputting
> a .bbl. You should look at the uniqueness code in biber (see sections
> 4.5.2 and 4.11.4 to get an idea of this). Again, there is nothing like
> this on the market at all - it's very complex.
Just this point does not require interaction with external libraries,
so I do not see its relevance here.
Do not confuse the complexity of the source code with the complexity
of the dependencies. It is only the latter that I complained about.
> Flexible data inheritance, on-the-fly source mapping etc.
> can't be done by an external tool as it depends on live state of the
> entry objects.
As long as "entry objects" are defined to be contained in the
local database, this is all fine (and biber is only about reading
some local files and outputting others, as I stated).
Otherwise (for a non-local database), what you mention is exactly
what should not happen, IMHO:
cross-references etc. are fine if they refer to the *local biber*
database. If they do not, it should not be biber's task to
resolve them.
IMHO it is fine to have *additional* external programs (some, for
proprietary databases, might even be proprietary), presumably with a
documented query format, to add cross-references to the local biber
database from various sources. Maybe one could also add a convenience
wrapper around these programs, but this wrapper should IMHO not be
biber itself: it is neither necessary nor desirable to do this in one
pass (let alone in one program).
Separating these tasks (filling the database and using the database)
also separates the dependencies (on various database/internet
backends, some perhaps even proprietary, as mentioned) and keeps the
actual biber program reliable and secure.
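The proposed separation can be sketched as two independent steps (all function names and data shapes here are hypothetical, invented for illustration; Python again stands in for Perl): an *optional* fetch tool that may carry network dependencies, and a resolver that touches only the local database.

```python
def query_remote(query):
    """Stand-in for a network backend. A real fetcher would live in a
    separate program with its own (possibly heavy) dependencies."""
    return {"knuth1984": {"title": "The TeXbook"}}

def fetch_remote_entries(query, local_db):
    """Separate tool: pulls remote entries INTO the local database."""
    local_db.setdefault("entries", {}).update(query_remote(query))

def resolve_crossrefs(local_db):
    """Core (biber-like) tool: resolves cross-references using ONLY
    the local database; it never touches the network."""
    entries = local_db.get("entries", {})
    for key, entry in entries.items():
        parent = entry.get("crossref")
        if parent is not None and parent not in entries:
            raise KeyError(f"unresolved crossref {parent!r} in {key!r}")
    return entries

# Usage: fetching first makes the later, purely local pass succeed.
db = {"entries": {"mythesis": {"crossref": "knuth1984"}}}
fetch_remote_entries("knuth1984", db)
resolve_crossrefs(db)
```

Because the resolver fails loudly on anything non-local, the fetch step can be audited, replaced, or omitted without affecting the core tool's reliability.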
This is also desirable from the user's point of view: just think
of the author's nightmare that an actually published book
contains a different bibliography than intended, because
an online database entry was suddenly changed to contain a different
cross-reference. Or even worse: if an online database gets hacked,
it might then do unexpected things on your machine (in the
best case, only to your references).
In summary, there is really no need to resolve references to
the non-local database within one pass (or one program).
> Yes, it could theoretically be written in C or lua.
I have not suggested doing this. A high-level language like Perl
for a high-level task is appropriate, even perl-5.14 if it is
really necessary. This would only be a problem if the requirement
changed to perl-5.16 a few days after it comes out, and then in
a year or so to perl-5.18, ... Depending on bleeding edge once,
if it really cannot be avoided, is one thing: it shrinks the
user base only until the former "bleeding edge" has become
standard. Requiring bleeding edge permanently keeps the
user base permanently small ;)