Biblatex: Uniquely truncating author name lists?

phil...@gmail.com

Oct 19, 2008, 5:28:17 PM

I know that uniquename serves to help disambiguate similar author
names across citations but what about author name lists? For example,
if I have maxnames=5 and minnames=1 and I have these two .bib entries:

@BOOK{6names,
AUTHOR = {Alfred Armstrong and Bertie Butterford and Cecil
Cuthbertson and Dickie Dartmouth and Eric Eroldson and Freddie
Foundlake},
PUBLISHER = NEP,
ADDRESS = GA,
TITLE = {Test 6},
YEAR = {2006}
}

@BOOK{6names2,
AUTHOR = {Alfred Armstrong and Graham Grimstead and Bertie
Butterford and Cecil Cuthbertson and Dickie Dartmouth and Eric
Eroldson},
PUBLISHER = NEP,
ADDRESS = GA,
TITLE = {Test 7},
YEAR = {2006}
}

That's two separate authors really, but in the same year. In
authoryear style with labelyear set, a \textcite of these two results
in:

Armstrong et al. (2006a)
Armstrong et al. (2006b)

Which isn't right as it looks like the authors are the same and they
wrote two papers in the same year. I suppose it should really truncate
to:

Armstrong et al. (2006)
Armstrong, Grimstead, et al. (2006)

Which is what the APA wants (and I'm currently trying to write a
decent APA biblatex style).
It seems that there is no way to do this with current Biblatex
options? I just want to make sure before I tackle what looks to be a
tricky problem if it's to be solved in the general case of
disambiguating name lists that truncate to the same namehash.

Philipp Lehman

Oct 20, 2008, 9:47:25 AM

phil...@gmail.com wrote:

> I know that uniquename serves to help disambiguate similar author
> names across citations but what about author name lists?

Not yet supported but on the wishlist. It will probably make it into
0.9.

> That's two separate authors really, but in the same year. In
> authoryear style with labelyear set, a \textcite of these two
> results in:
>
> Armstrong et al. (2006a)
> Armstrong et al. (2006b)
>
> Which isn't right as it looks like the authors are the same and they
> wrote two papers in the same year. I suppose it should really
> truncate to:
>
> Armstrong et al. (2006)
> Armstrong, Grimstead, et al. (2006)

Shouldn't that be:

Armstrong, Butterford, et al. (2006)

Armstrong, Grimstead, et al. (2006)

instead?

> Which is what the APA wants (and I'm currently trying to write a
> decent APA biblatex style).

> It seems that there is no way to do this with current Biblatex
> options? I just want to make sure before I tackle what looks to be a
> tricky problem if it's to be solved in the general case of
> disambiguating name lists that truncate to the same namehash.

I don't think you can solve that in a style. You need support in the
core package because the labels are generated by Bibtex.

BTW: What's the APA rule if the number of names required to disambiguate
the citation exceeds maxnames (e.g., you have maxnames=5 and two
entries with 6 authors where only the 6th author differs)?

--
Sender address blackholed; do not reply to From: address.
You can still reach me by email at: plehman gmx net.

phil...@gmail.com

Oct 20, 2008, 12:46:49 PM

On Oct 20, 3:47 pm, Philipp Lehman <devnull.1.leh...@spamgourmet.com>
wrote:

> Shouldn't that be:
>
> Armstrong, Butterford, et al. (2006)
> Armstrong, Grimstead, et al. (2006)
>
> instead?

Yes, quite right, my mistake.

(I just sent you a mail about this as I think it's probably a feature
request, not a simple style question)

> I don't think you can solve that in a style. You need support in the
> core package because the labels are generated by Bibtex.

I didn't think so after I had a few aborted attempts at it. I sent you
a feature-request. You can turn off maxnames by setting it to
something large and then try doing things yourself in the labelname
format but it won't work in general and it's messy. When I realised
that maxnames and minnames are dealt with by biblatex.sty, I gave up.

> BTW: What's the APA rule if the number of names required to disambiguate
> the citation exceeds maxnames (e.g., you have maxnames=5 and two
> entries with 6 authors where only the 6th author differs)?

Their rules are (section 3.95 of the APA style manual):

Author lists of 3-5 authors are cited in full the first time and
truncated with "et al." thereafter, except in cases of ambiguity, where
you are to add "as many as necessary" names to disambiguate. If this
means giving up the truncation completely, so be it. In cases of 6
authors or more, the list is supposed to be truncated always (which
would mean maxnames=5, minnames=1), but again there is the same
disambiguation requirement ("as many as necessary"). So, if
disambiguation requires ignoring max/minnames, that's OK, I think.

Implementing this would also fix the problem I mentioned above as you
could make labelyear only use fullhash and not namehash so that
labelyear was only used for different works in the same year by
exactly the same group of authors, which is how it should be I think.

This is like a "non-strict" minnames I suppose but isn't trivial to do
because you have to basically determine such truncation ambiguity
after you have all of the citations to compare.
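To make the "as many as necessary" rule concrete, here is a minimal sketch in Python. It is not how biblatex or BibTeX does anything; the function name, the data layout and the stopping rule are assumptions made for illustration. It starts every over-long list at minnames and keeps adding one name to each list that still prints the same as a different list, even if that means going past maxnames.

```python
from collections import defaultdict


def visible_name_counts(name_lists, minnames=1, maxnames=5):
    """Return, per entry key, how many names to print before "et al.".

    name_lists maps an entry key to its list of family names
    (a hypothetical layout chosen for this sketch).
    """
    counts = {}
    for key, names in name_lists.items():
        # Lists within the limit are printed in full; longer ones start at minnames.
        counts[key] = len(names) if len(names) <= maxnames else minnames

    changed = True
    while changed:
        changed = False
        # Group entries by what would actually be printed.
        groups = defaultdict(list)
        for key, names in name_lists.items():
            n = counts[key]
            truncated = n < len(names)
            groups[(tuple(names[:n]), truncated)].append(key)
        for keys in groups.values():
            # Identical full lists are left alone; those are labelyear's job.
            if len({tuple(name_lists[k]) for k in keys}) < 2:
                continue
            for k in keys:
                if counts[k] < len(name_lists[k]):
                    counts[k] += 1
                    changed = True
    return counts


entries = {
    "6names": ["Armstrong", "Butterford", "Cuthbertson",
               "Dartmouth", "Eroldson", "Foundlake"],
    "6names2": ["Armstrong", "Grimstead", "Butterford",
                "Cuthbertson", "Dartmouth", "Eroldson"],
}
print(visible_name_counts(entries))  # {'6names': 2, '6names2': 2}
```

With the two entries from the start of the thread this gives two visible names each, i.e. "Armstrong, Butterford, et al." and "Armstrong, Grimstead, et al.". The loop needs the complete set of cited entries before it can decide anything, which is exactly why this is awkward to do per citation.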

PK

phil...@gmail.com

Oct 20, 2008, 2:19:39 PM

I also note that, on the topic of disambiguation, uniquename doesn't
work with multiple authors?

@BOOK{2names,
AUTHOR = {Alfred Armstrong and Bertie Butterford},

PUBLISHER = NEP,
ADDRESS = GA,

TITLE = {Test 2},
YEAR = {2002}
}

@BOOK{2names2,
AUTHOR = {Peter Armstrong and Bertie Butterford},

PUBLISHER = NEP,
ADDRESS = GA,

TITLE = {Test 2},
YEAR = {2003}
}

\textcite{2names}
\textcite{2names2}

Armstrong and Butterford (2002)
Armstrong and Butterford (2003)

or with the same year and labelyear enabled:

Armstrong and Butterford (2002a)
Armstrong and Butterford (2002b)

Which again, is misleading as the authors aren't the same. With
uniquename=init, should be:

A. Armstrong and Butterford (2002)
P. Armstrong and Butterford (2002)

and

A. Armstrong and Butterford (2002)
P. Armstrong and Butterford (2003)

right?
I'm not trying to find faults here - biblatex is a really superb
package and this is conceptually the same type of issue as the first one
I mentioned - disambiguation of author lists, saving labelyear for
identical lists publishing in the same year.
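The behaviour asked for here can be sketched in a few lines of Python. This is only an illustration under assumptions: names are given as (given, family) tuples, and a family name counts as ambiguous as soon as it occurs with more than one given name anywhere in the cited entries. It does not handle the case where the initials collide as well (e.g. Alfred and Andrew Armstrong).

```python
from collections import defaultdict


def label_names(entries):
    """Return per-entry label names, adding an initial where a family name is ambiguous.

    entries maps an entry key to a list of (given, family) tuples
    (a hypothetical layout chosen for this sketch).
    """
    # First pass: collect every given name seen with each family name.
    given_by_family = defaultdict(set)
    for names in entries.values():
        for given, family in names:
            given_by_family[family].add(given)

    # Second pass: prefix an initial only where the family name is shared.
    labels = {}
    for key, names in entries.items():
        parts = []
        for given, family in names:
            if len(given_by_family[family]) > 1:
                parts.append(f"{given[0]}. {family}")
            else:
                parts.append(family)
        labels[key] = parts
    return labels


cited = {
    "2names": [("Alfred", "Armstrong"), ("Bertie", "Butterford")],
    "2names2": [("Peter", "Armstrong"), ("Bertie", "Butterford")],
}
for key, names in label_names(cited).items():
    print(key, " and ".join(names))
# 2names A. Armstrong and Butterford
# 2names2 P. Armstrong and Butterford
```

Like list truncation, this needs two passes over all cited entries, one to collect the names and one to build the labels.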

Philipp Lehman

Oct 20, 2008, 3:22:55 PM

phil...@gmail.com wrote:

> I also note that, on the topic of disambiguation, uniquename doesn't
> work with multiple authors?

Indeed.

> Which again, is misleading as the authors aren't the same. With
> uniquename=init, should be:
>
> A. Armstrong and Butterford (2002)
> P. Armstrong and Butterford (2002)
>
> and
>
> A. Armstrong and Butterford (2002)
> P. Armstrong and Butterford (2003)
>
> right?

Right, conceptually speaking. Alas, there are technical issues, too.

> I'm not trying to find faults here - biblatex is a really superb
> package and this is conceptually the same type of issue as the first
> one I mentioned - disambiguation of author lists, saving labelyear
> for identical lists publishing in the same year.

Your point is perfectly clear. It's just that disambiguation of
individual author names in a list of names is beyond what you can
handle using Bibtex's rather limited stack language (well, at least
in a sensible way). I'm afraid 'uniquename' is not likely to be
extended.

In this case, 'labelyear' is the only way to make sure that all
citations are unique, even if the cited items are not authored by
exactly the same group of people. That's second best only, no doubt
about that, but it's better than no disambiguation at all.
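As a rough Python sketch of the fallback described above (an illustration only, not the code in biblatex.bst): entries whose printed label and year coincide get a letter suffix, whether or not the underlying author lists are really identical. The tuple layout and the function name are assumptions, and the sketch assumes at most 26 entries per group.

```python
from collections import defaultdict
from string import ascii_lowercase


def assign_year_suffixes(citations):
    """citations: list of (key, printed_label, year) in bibliography order.

    Returns {key: year as a string, with a letter suffix where needed}.
    """
    # Entries collide when both the printed label and the year match.
    groups = defaultdict(list)
    for key, label, year in citations:
        groups[(label, year)].append(key)

    result = {}
    for (label, year), keys in groups.items():
        if len(keys) == 1:
            result[keys[0]] = str(year)
        else:
            # Assumes no more than 26 colliding entries.
            for letter, key in zip(ascii_lowercase, keys):
                result[key] = f"{year}{letter}"
    return result


cites = [
    ("2names", "Armstrong and Butterford", 2002),
    ("2names2", "Armstrong and Butterford", 2002),
    ("other", "Smith", 2002),
]
print(assign_year_suffixes(cites))
# {'2names': '2002a', '2names2': '2002b', 'other': '2002'}
```

The grouping key is the printed label, so anything that changes the label (truncation, initials) also changes which entries receive a suffix.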

phil...@gmail.com

Oct 20, 2008, 5:45:39 PM

On Oct 20, 9:22 pm, Philipp Lehman <devnull.1.leh...@spamgourmet.com>
wrote:

> In this case, 'labelyear' is the only way to make sure that all

> citations are unique, even if the cited items are not authored by
> exactly the same group of people. That's second best only, no doubt
> about that, but it's better than no disambiguation at all.

Do you think it would be possible to move that code to TeX by
ignoring max/minnames at the BibTeX level and doing something like
this in the labelname format (this is obviously not general enough - I
was just playing with it for the APA 3-5 name case. labelname:doname
just does what the original labelname format does):

\DeclareNameFormat{labelname}{%
  \ifthenelse{\NOT\value{listcount}=1\AND \value{listtotal}>2\AND
              \value{listtotal}<6}
    {\ifciteseen
      {\ifnum\value{listcount}=2 \andothersdelim\bibstring{andothers}
       \fi
       \ifnum\value{listcount}=3 \relax\fi
       \ifnum\value{listcount}=4 \relax\fi
       \ifnum\value{listcount}=5 \relax\fi
      }
      {\usebibmacro{labelname:doname}{#1}{#2}{#3}{#4}{#5}{#6}{#7}{#8}}
    }
    {\usebibmacro{labelname:doname}{#1}{#2}{#3}{#4}{#5}{#6}{#7}{#8}}
}

You could then put in here code to save mappings of entrykeys to
namehashes to see if there were identical hashes for different keys
later on. If there were, you could expand the ambiguous name lists.
I'm only guessing here as I don't have the expertise to do this
without a great deal of pain. I just thought it might be feasible, since the
good thing about biblatex is that you can leverage LaTeX and TeX for the code.
I'm probably being naive here but I have an almost complete APA
citation style apart from these parts ...

An alternative is perhaps a manual way to specify what maxnames/
minnames should be per citation so users can disambiguate manually?
And a strict/nostrict labelyear option because at the moment, it does,
as you point out, two different jobs and that would really create an
invalid APA paper if it were used in both senses (disambiguating same
authors in same year and different truncated authors).

PK

Charles de Miramon

Oct 20, 2008, 6:16:47 PM

phil...@gmail.com wrote:

> Do you think it would be possible to move that code to TeX by
> ignoring max/minnames at the BibTeX level and doing something like
> this in the labelname format (this is obviously not general enough - I
> was just playing with it for the APA 3-5 name case. labelname:doname
> just does what the original labelname format does):
>

Since I have started using biblatex, I was wondering how difficult it would
be to code a quick replacement of bibtex for biblatex in Python or any
other language.

The bbl files produced by biblatex are very simple, and sorting and working
on strings could be done in a much easier way in any modern interpreted
language. One could also easily add what is missing in bibtex (unicode, a
better crossreferencing).

Just thinking aloud.

Cheers,
Charles
--
http://www.kde-france.org

phil...@gmail.com

Oct 21, 2008, 4:50:52 AM

On Oct 21, 12:16 am, Charles de Miramon <cmira...@nerim.net> wrote:

> Just thinking aloud.

It's a good point and one which others have started on I think - have
you seen CrossTeX?

http://www.cs.cornell.edu/People/egs/crosstex/

I don't think it uses BiBTeX at all and I think it's all in python. It
was more of an attempt to replace the .bib format though, as far as I
can tell - it's not as fully featured as Biblatex. I also wonder what
impact LuaTeX in general will have on such problems although if it's a
BibTeX problem, probably not much.

phil...@gmail.com

Oct 21, 2008, 4:53:52 AM

On Oct 21, 10:50 am, philk...@gmail.com wrote:
> On Oct 21, 12:16 am, Charles de Miramon <cmira...@nerim.net> wrote:
>
> > Just thinking aloud.
>
> It's a good point and one which others have started on I think - have
> you seen CrossTeX?
>
> http://www.cs.cornell.edu/People/egs/crosstex/

Hmm, you know what, CrossTeX might work ... it would be good if
Philipp could comment?

Robin Fairbairns

Oct 21, 2008, 6:06:39 AM

phil...@gmail.com writes:

>On Oct 21, 12:16 am, Charles de Miramon <cmira...@nerim.net> wrote:
>
>> Just thinking aloud.
>
>It's a good point and one which others have started on I think - have
>you seen CrossTeX?
>
>http://www.cs.cornell.edu/People/egs/crosstex/

i was impressed by it, and persuaded them to let us hold it on ctan.
mind you, i've not actually used it.

>I don't think it uses BiBTeX at all and I think it's all in python.

correct.

>It
>was more of an attempt to replace the .bib format though, as far as I
>can tell - it's not as fully featured as Biblatex. I also wonder what
>impact LuaTeX in general will have on such problems although if it's a
>BibTeX problem, probably not much.

the problem is already with us: luatex is not the only engine that has
utf-8 as default language.

we need something (i don't think crosstex is there yet -- i know it
didn't grok utf-8 when i looked last) to do what xindy did (and seems
to be continuing to do -- there was a paper at the tug conference
about developments).
--
Robin Fairbairns, Cambridge

Kjell Magne Fauske

Oct 21, 2008, 8:47:40 AM

On Oct 21, 10:50 am, philk...@gmail.com wrote:

CrossTeX looks really interesting. Thanks for the link.

There is also a project called pybtex:

http://pybtex.sourceforge.net/

It claims to be a drop-in replacement for BibTeX written in Python. I
tried it a long time ago but could not get it to work properly. Its
BibTeX parser was also too slow for large bib files. Things may have
changed since then.

- Kjell Magne Fauske

Charles de Miramon

Oct 21, 2008, 8:54:21 AM

pybliographer also has a command-line bib parser, a sorting mechanism, etc.

http://arch.pybliographer.org/documentation/

Brent Lievers

Oct 21, 2008, 9:16:00 AM

Charles de Miramon <cmir...@nerim.net> wrote:
> Since I have started using biblatex, I was wondering how difficult it would
> be to code a quick replacement of bibtex for biblatex in Python or any
> other language.
>
> The bbl files produced by biblatex are very simple, and sorting and working
> on strings could be done in a much easier way in any modern interpreted
> language. One could also easily add what is missing in bibtex (unicode, a
> better crossreferencing).

I've been playing around with doing bibliographies in python (which I've
been calling bippy) as a hobby project. It's on hiatus at the moment
because I'm concentrating on finishing up my PhD. Also the code and data
format are in need of some major structural changes.

The latter is XML-based, ergo UTF-8 internally. I've posted a PDF
containing formatted entries (AMA, APA, MLA) and a sample input file:

http://qlink.queensu.ca/~3wbl/CommonExamples.pdf

in case anyone is interested. It allows just about anything to be
crossreferenced: journals, authors, even other articles (see examples
37-41).

As I said, it is a project in forced hibernation pending my defence. But
if people have any comments, or can suggest a particular test case where
bibtex is inadequate, I'd appreciate hearing them. Just as long as you
know that I won't act on your comments for a couple of months ;-)

Brent

phil...@gmail.com

Oct 21, 2008, 12:47:10 PM

Well, I didn't realise how many other projects there were attempting
to replace BibTeX - all very interesting. I wonder if PL could comment
on what he requires from a BibTeX replacement to handle the sorts of
string manipulations needed to fully satisfy, for example, the APA
requirements? It looks like a new backend replacing BibTeX is the way
to go here, so that the really excellent LaTeX-based front end which
biblatex implements isn't held back by the rather basic and old
BibTeX stack language.

Philipp Lehman

Oct 21, 2008, 1:41:10 PM

phil...@gmail.com wrote:

> Do you think it would be possible to move that code to TeX by
> ignoring max/minnames at the BibTeX level and doing something like
> this in the labelname format (this is obviously not general enough -
> I was just playing with it for the APA 3-5 name case.
> labelname:doname just does what the original labelname format does):

The problem with this approach is that list truncation affects
the 'labelyear' field. If you handle the truncation on the Latex
side, labelyear will be out of sync. It also affects the namehash and
some other things which are handled by biblatex.bst.
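The dependency can be illustrated with a small Python sketch. The hash functions below are stand-ins invented for this example (the real namehash and fullhash are computed in biblatex.bst and will not match these values); only the relationship between them matters. Two lists that truncate to the same visible names share a "name hash" even though their "full hashes" differ.

```python
import hashlib


def _digest(parts):
    # Stand-in hash for illustration; not the algorithm biblatex uses.
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()


def full_hash(names):
    """Hash of the complete name list."""
    return _digest(names)


def name_hash(names, minnames=1, maxnames=5):
    """Hash of the list as it would be printed after truncation."""
    if len(names) > maxnames:
        # "+" marks the truncation ("et al.").
        return _digest(names[:minnames] + ["+"])
    return _digest(names)


a = ["Armstrong", "Butterford", "Cuthbertson",
     "Dartmouth", "Eroldson", "Foundlake"]
b = ["Armstrong", "Grimstead", "Butterford",
     "Cuthbertson", "Dartmouth", "Eroldson"]

same_label = name_hash(a) == name_hash(b)    # both print as "Armstrong et al."
same_authors = full_hash(a) == full_hash(b)  # the full lists differ
print(same_label, same_authors)  # True False
```

If the year suffix is assigned per name hash on the BibTeX side while the visible names are chosen again on the LaTeX side, the two stages can disagree about which entries collide, and that is the out-of-sync problem.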

Philipp Lehman

Oct 21, 2008, 1:42:01 PM

Charles de Miramon wrote:

> Since I have started using biblatex, I was wondering how difficult
> it would be to code a quick replacement of bibtex for biblatex in
> Python or any other language.
>
> The bbl files produced by biblatex are very simple, and sorting and
> working on strings could be done in a much easier way in any modern
> interpreted language. One could also easily add what is missing in
> bibtex (unicode, a better crossreferencing).

There's a prototype of a Bibtex replacement for biblatex on
www.sourceforge.net. Search for "biblatex-biber". It's been some time
since I talked to François about it but I think full Unicode support
is already working.

It is pretty obvious that the next major step in biblatex development
would be replacing Bibtex by a tool that is designed to work with
biblatex from the ground up. But that's a longer-term project. It's
something a single person can't pull off. I don't plan to take this
step before biblatex 1.0 is out of the door. The idea is to stick
with the current Bibtex-based architecture and finish 1.0. After
that, we'll have to see. This implies that features which can't be
implemented in BST files in a sensible way (but must be handled on
the Bibtex side of the workflow) will have to wait until after 1.0.

Of course it all depends on how much input I get from others. I don't
have the time to handle both the Latex side and work on a Bibtex
successor. If someone (or a group of people) is willing to work on a
Bibtex successor that closely interacts with the Latex components of
biblatex, this could bring biblatex to a new level (Unicode support,
cross-referencing configurable per style, sorting schemes
configurable per style, custom labels, additional "lists of X" such
as a list of journal abbreviations sorted by abbreviation...).
Essentially, everything which is handled on the "Bibtex side" should
be configurable in a style.

It's not only Unicode support that is missing. We need a completely
different data model which includes a real concept of a list (instead
of text fields with funny "and" separators), supports nested data
structures, etc. We should also switch to XML.
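To show why the "and" separator is awkward, here is a minimal Python sketch of splitting a BIB name field. It is an approximation written for this example, not BibTeX's actual parser: the field is split on the word "and" only at brace depth 0, so a braced corporate name survives intact.

```python
def split_names(field):
    """Split a BIB-style name field on the word "and" at brace depth 0."""
    names, current, depth = [], [], 0
    # Splitting on whitespace also folds line breaks inside the field.
    for token in field.split():
        if token.lower() == "and" and depth == 0:
            names.append(" ".join(current))
            current = []
            continue
        # Track nesting so that "and" inside braces is kept as text.
        depth += token.count("{") - token.count("}")
        current.append(token)
    names.append(" ".join(current))
    return names


field = "Alfred Armstrong and {Barnes and Noble} and Bertie Butterford"
print(split_names(field))
# ['Alfred Armstrong', '{Barnes and Noble}', 'Bertie Butterford']
```

A data model with a real list type would make this parsing step, and the brace workaround, unnecessary.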

phil...@gmail.com

Oct 21, 2008, 2:07:05 PM

On Oct 21, 7:41 pm, Philipp Lehman <devnull.1.leh...@spamgourmet.com>
wrote:

> The problem with this approach is that list truncation affects

> the 'labelyear' field. If you handle the truncation on the Latex
> side, labelyear will be out of sync. It also affects the namehash and
> some other things which are handled by biblatex.bst.

Right, I think I now appreciate more where the division of labour is
between BiBTeX and Biblatex which means that your previous post about
a BiBTeX replacement is really the issue.

phil...@gmail.com

Oct 21, 2008, 2:32:59 PM

On Oct 21, 7:42 pm, Philipp Lehman <devnull.1.leh...@spamgourmet.com>
wrote:

> It's not only Unicode support that is missing. We need a completely

> different data model which includes a real concept of a list (instead
> of text fields with funny "and" separators), supports nested data
> structures, etc. We should also switch to XML.

I had a quick look at the biblatex-biber code (it's in perl, which is
a good thing ...) and it hasn't been touched for quite a few months.
When you say you'd like something which "closely interacts" with the
biblatex LaTeX layer, what more do you mean than something with a
decent XML data model which reads aux files and spits out the right
bbl files? I assume that you would want a different .bbl file format
or would you want some sort of custom intermediate file format
completely other than .bbl?

Gerd Neugebauer

Oct 22, 2008, 2:13:44 AM

Philipp Lehman wrote:
> Charles de Miramon wrote:
>
>> Since I have started using biblatex, I was wondering how difficult
>> it would be to code a quick replacement of bibtex for biblatex in
>> Python or any other language.

>> The bbl files produced by biblatex are very simple, and sorting and
>> working on strings could be done in a much easier way in any modern
>> interpreted language. One could also easily add what is missing in
>> bibtex (unicode, a better crossreferencing).
>
> There's a prototype of a Bibtex replacement for biblatex on
> www.sourceforge.net. Search for "biblatex-biber". It's been some time
> since I talked to François about it but I think full Unicode support
> is already working.

I have written a BibTeX replacement in Java which I have donated to the
ExTeX project. It is called ExBib. It has Unicode support and enhanced
sorting features. I am in the course of completing the documentation and
test suite before a first public release. Nevertheless the code is
pretty complete and I am using it to process bibliographies for ExTeX
already.

Besides, I have started writing a compiler from this nasty BST language to
Groovy. This is also a step towards a more modern language in this context.

> Of course it all depends on how much input I get from others. I don't
> have the time to handle both the Latex side and work on a Bibtex
> successor. If someone (or a group of people) is willing to work on a
> Bibtex successor that closely interacts with the Latex components of
> biblatex, this could bring biblatex to a new level (Unicode support,
> cross-referencing configurable per style, sorting schemes
> configurable per style, custom labels, additional "lists of X" such
> as a list of journal abbreviations sorted by abbreviation...).
> Essentially, everything which is handled on the "Bibtex side" should
> be configurable in a style.

This sounds like the ideas I am following in ExBib. I have introduced a
primitive \biboptions which can be placed in the aux file to pass
options to ExBib. This mechanism is in place and working for the sorting
order specification. Making other aspects configurable should be rather
straightforward.

> It's not only Unicode support that is missing. We need a completely
> different data model which includes a real concept of a list (instead
> of text fields with funny "and" separators), supports nested data
> structures, etc. We should also switch to XML.

For ExBib i have experimented with different input languages. For
backward compatibility my focus is on the classical BibTeX type of
files. Nevertheless I also have worked on an XML and on a lisp parser
(and prettyprinter). Thus playing around with the syntax should not be a
big deal.

If you are interested in ExBib I am willing to cooperate to make it more
useful and more widely used.

Ciao
Gerd

Simon Spiegel

Oct 22, 2008, 11:26:23 AM

On 2008-10-21 13:42:01 -0400, Philipp Lehman
<devnull....@spamgourmet.com> said:

I'm sure you already know this, but there is a lot of work going on
with CSL, the Citation Style Language (http://xbiblio.sourceforge.net/ ),
developed mainly by Bruce D'Arcus and already in use by Zotero
(http://www.zotero.org/ ). From my understanding, CSL aims at something
very similar to biblatex: providing a means to customize complex
citation formats. I don't know if the two projects can actually benefit
from each other, since biblatex is firmly rooted in the LaTeX world
while CSL has a more generic approach. But with the success of Zotero and
the upcoming version 1.2 of the OpenDocument format, which will offer
support for CSL-like styling, CSL is definitely something to watch.

There is also the Bibliographic Ontology Specification
(http://bibliontology.com/ ). I must confess that I understand even less
what this project tries to do, but I think it provides a
data model for bibliographic data. So this might actually be something
a post-1.0 biblatex might use.

simon

Philipp Lehman

Oct 23, 2008, 10:50:36 AM

phil...@gmail.com wrote:
> I had a quick look at the biblatex-biber code (it's in perl, which
> is a good thing ...) and it hasn't been touched for quite a few
> months. When you say you'd like something which "closely interacts"
> with the biblatex LaTeX layer, what more do you mean than something
> with a decent XML data model which reads aux files and spits out the
> right bbl files? I assume that you would want a different .bbl file
> format or would you want some sort of custom intermediate file
> format completely other than .bbl?

First and foremost, I'd want a decent Latex->Bibtex interface! Note
the arrow; the bbl file is a Latex<-Bibtex interface. This interface
is defined in biblatex.bst hence I have complete control over its
format. The bottleneck is the Latex->Bibtex step because that's
hard-coded in the Bibtex binary.

Let me start by giving a short overview of the current interface.
Latex writes three types of information to the aux file: \citation,
\bibstyle, and \bibdata. With biblatex, the \bibstyle is always
biblatex.bst and \bibdata merely names the data files, so the run-time
Latex->Bibtex communication boils down to \citation. In other
words: the only message I can send to the database frontend at
run-time is "I want to cite X"! That's all. You could say that Bibtex
is a somewhat autistic piece of software.
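How little travels through the aux file can be seen from a short Python sketch that collects the commands BibTeX acts on: \citation, \bibdata and \bibstyle. The sample content and the file names "control" and "refs" are made up for this example, and the sketch ignores \@input lines for included aux files.

```python
import re

# The aux-file commands that matter to BibTeX.
AUX_COMMAND = re.compile(r"\\(citation|bibdata|bibstyle)\{([^}]*)\}")


def read_aux(text):
    """Collect the arguments of \\citation, \\bibdata and \\bibstyle."""
    found = {"citation": [], "bibdata": [], "bibstyle": []}
    for command, argument in AUX_COMMAND.findall(text):
        # \citation and \bibdata take comma-separated lists.
        found[command].extend(part.strip() for part in argument.split(","))
    return found


# Hypothetical aux content; "control" and "refs" are invented names.
sample = r"""
\relax
\citation{6names}
\citation{6names2}
\bibstyle{biblatex}
\bibdata{control,refs}
"""
print(read_aux(sample))
# {'citation': ['6names', '6names2'], 'bibdata': ['control', 'refs'], 'bibstyle': ['biblatex']}
```

There is no slot in any of these commands for options attached to a single citation or to an entry type, which is the limitation discussed here.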

I'm forced to use a pseudo bib file to pass some control parameters to
biblatex.bst. Instead of

@Article{foo,
author = {John Doe}
...
}

this file reads something like

@Control{biblatex-control,
ctrl-options = {0.8:0:0:0:0:1:1:0:1:0:1:2:1:3:1:79:+},
}

While this works pretty well, passing control parameters through the
data interface is plain ridiculous from an architectural standpoint.
However, it's the only way to at least pass some parameters at run
time. This interface is limited to global settings, though. There is
no way to attach parameters to a single request (such as: "only
provide the data for this entry but ignore it when generating labels
and sorting the bibliography") or to a class of entries (such
as: "exclude all @online entries from the bibliography").

But there is more. Everything (well, almost everything) which is
handled by the database backend should be configurable on the style
level. Essentially, biblatex.sty would create some kind of
<jobname>.cfg file which configures the backend.

Let's look at a seemingly simple case (based on a feature request by a
user): suppose that some style guide mandates that edited collections
are to be sorted and cited by title rather than by editor. Sounds
trivial? Can't be done in a style. In a style, you can only set
useeditor=false globally but that's not what's required here. All you
can do in this case is to set it locally in each @collection entry --
which renders the bib file unportable because you're hard-coding
style requirements in the data file. What would it take to implement
that in a style?

Simple solution: add support for per-type options. The style would
declare:

\ExecuteEntryOption{collection}{useeditor=false}

and biblatex would write that to the aux or to some <jobname>.cfg
file. This is sufficient to handle the case at hand but only because
there already is an option designed for this particular case (it's
just that you can't set it on a per-type basis). In other cases, a
more generic approach may be required so let's consider that.
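The lookup order implied by per-type options can be sketched in Python as follows. This is only an illustration of the idea; \ExecuteEntryOption is the proposed interface from the paragraph above, and the dictionaries and the function name below are assumptions, not an existing API.

```python
def resolve_option(name, entry, type_options, global_options):
    """Per-entry setting wins, then per-type, then the global default."""
    if name in entry.get("options", {}):
        return entry["options"][name]
    per_type = type_options.get(entry["type"], {})
    if name in per_type:
        return per_type[name]
    return global_options[name]


global_options = {"useeditor": True}
# What a style would declare via the proposed per-type interface.
type_options = {"collection": {"useeditor": False}}

book = {"type": "book"}
collection = {"type": "collection"}
override = {"type": "collection", "options": {"useeditor": True}}

print(resolve_option("useeditor", book, type_options, global_options))        # True
print(resolve_option("useeditor", collection, type_options, global_options))  # False
print(resolve_option("useeditor", override, type_options, global_options))    # True
```

The style requirement lives in type_options, so the bib file stays portable and an individual entry can still override it.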

The issue at hand is related to the sorting order of the bibliography
and to citations (the labelname field in this case). It requires 1)
configurable sorting schemes (i.e., the 'schemes' discussed in section
3.4 of the manual) and 2) configurable fallback chains
for 'labelname'. We'd need:

1) An interface to declare sorting schemes. For example, the
standard "nty" scheme would be declared like this (as a default
setting in biblatex.def):

\DeclareSortingScheme{nty}{*}{
\item{presort}
\item{sortname,author,editor,sorttitle,title}
\item{sorttitle,title}
\item{sortyear,year}
}

The style would override the scheme on a per-type basis:

\DeclareSortingScheme{nty}{collection}{
\item{presort}
\item{sorttitle,title}
\item{sortyear,year}
}

Biblatex would write this information to the <jobname>.cfg file
(probably in some other format). Currently, this is all
hard-coded in biblatex.bst.

2) The second missing link is the labelname field. This would also
require a configuration interface. Default:

\DeclareLabelScheme{labelname}{*}{
\item{shortauthor,author,shorteditor,editor}
}

Change required by style:

\DeclareLabelScheme{labelname}{collection}{
\item{} % disable 'labelname' for @collection entries
}

Of course we could push this further, allowing some kind of format
specification, truncation limits, etc., but simply making the order
of the fields considered when generating 'labelname' (and
conceptually related fields) configurable would go a long way. Note
that 'labelname' also affects the 'labelyear' field,
the 'singletitle' option, etc., so it can't be done exclusively on
the Latex side.
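A backend could evaluate such declarations roughly like the Python sketch below. The dictionaries mirror the proposed \DeclareSortingScheme and \DeclareLabelScheme examples above; everything else (names, entry layout, case folding) is an assumption made for illustration.

```python
# Each scheme is a list of fallback chains; "*" is the default for all types.
SORTING = {
    "*": [["presort"],
          ["sortname", "author", "editor", "sorttitle", "title"],
          ["sorttitle", "title"],
          ["sortyear", "year"]],
    "collection": [["presort"], ["sorttitle", "title"], ["sortyear", "year"]],
}
LABELNAME = {
    "*": ["shortauthor", "author", "shorteditor", "editor"],
    "collection": [],  # disable 'labelname' for @collection entries
}


def first_present(entry, fields):
    """Return the first non-empty field of a fallback chain, else ""."""
    for field in fields:
        if entry.get(field):
            return entry[field]
    return ""


def sort_key(entry, schemes=SORTING):
    scheme = schemes.get(entry["type"], schemes["*"])
    return tuple(first_present(entry, chain).casefold() for chain in scheme)


def label_name(entry, schemes=LABELNAME):
    return first_present(entry, schemes.get(entry["type"], schemes["*"]))


bibliography = [
    {"type": "book", "author": "Armstrong", "title": "Test 6", "year": "2006"},
    {"type": "collection", "editor": "Zimmer", "title": "Anthology", "year": "2001"},
    {"type": "book", "author": "Butterford", "title": "Another", "year": "2003"},
]
print([e["title"] for e in sorted(bibliography, key=sort_key)])
# ['Anthology', 'Test 6', 'Another']
```

The collection sorts under its title rather than under "Zimmer", and label_name returns an empty string for it, so a citation style would have to fall back to the title.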

Philipp Lehman

Oct 23, 2008, 10:51:27 AM

Gerd Neugebauer wrote:
> If you are interested in ExBib I am willing to cooperate to make it
> more useful and more widely used.

I'm generally interested (after biblatex 1.0 is out, that is) but
we'll have to investigate if, where, and how cooperation makes sense.

> Besides, I have started writing a compiler from this nasty BST
> language to Groovy. This is also a step towards a more modern
> language in this context.

Please correct me if I'm mistaken: this sounds like ExBib replaces
BibTeX in such a way that BST files are coded in Groovy. The overall
architecture would be similar to BibTeX. That is, you're writing a
more or less ready-to-use BBL file which is read by Latex. Citations
are not dealt with by ExBib. Is that correct?

> For ExBib i have experimented with different input languages. For
> backward compatibility my focus is on the classical BibTeX type of
> files. Nevertheless I also have worked on an XML and on a lisp
> parser (and prettyprinter). Thus playing around with the syntax
> should not be a big deal.

It's not just the syntax, it's the data model. To me, one of the major
motivations for dumping Bibtex would be adopting a new model for
bibliographic data (I'm not saying that it would be a big deal to
implement that. I'm merely stressing the difference between the
representation of the data -- BIB vs. XML vs. whatever -- and the
hierarchy implied by the data model).

One of the problems I've run into while working on biblatex is that
you can't attach properties to list items (or attach a list to an
item in some other list). Consider this simple case: a book with two
publishers. The cover says:

Deutscher Taschenbuch-Verlag
München
Walter de Gruyter
Berlin · New York

All you can do in this case is to concatenate the publisher/location
data -- which means that you're effectively discarding some of the
information at hand:

publisher = {DTV and Walter de Gruyter},
location = {M{\"u}nchen and Berlin and New York},

We'd need something like this (written in pseudo-BIB syntax):

publisher = {
name = {DTV},
location = {München},
},
publisher = {
name = {Walter de Gruyter},
location = {Berlin},
location = {New York},
},

What's more, there are localization issues even on the data level:

publisher = {
name = {DTV},
location[german] = {München},
location[english] = {Munich},
...
},

That's why I believe that the traditional BIB files are a rather
hopeless case. Mind you, the example above is a very basic thing. If
you look at something like MODS[1], you'll realize just how hopeless
the case of the traditional BIB format is (again, not because of its
syntax but because of the data model).
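In a language with real data structures the publisher example is easy to express. The Python sketch below is one possible shape for such a model, not a proposal for an actual format; the class names, the language keys and the output formatting are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Location:
    default: str
    localized: dict = field(default_factory=dict)  # language -> name

    def name(self, language=None):
        # Fall back to the default form when no localized one exists.
        return self.localized.get(language, self.default)


@dataclass
class Publisher:
    name: str
    locations: list = field(default_factory=list)


def format_publishers(publishers, language=None):
    """Join each publisher with its own locations (format chosen arbitrarily)."""
    parts = []
    for publisher in publishers:
        places = "/".join(loc.name(language) for loc in publisher.locations)
        parts.append(f"{places}: {publisher.name}")
    return "; ".join(parts)


publishers = [
    Publisher("DTV", [Location("München", {"english": "Munich"})]),
    Publisher("Walter de Gruyter", [Location("Berlin"), Location("New York")]),
]
print(format_publishers(publishers))
# München: DTV; Berlin/New York: Walter de Gruyter
print(format_publishers(publishers, "english"))
# Munich: DTV; Berlin/New York: Walter de Gruyter
```

Each location stays attached to its publisher, which is the information the flat "and"-separated fields throw away.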

[1] http://www.loc.gov/standards/mods/

Joseph Wright

Oct 23, 2008, 11:10:15 AM

On Oct 23, 3:51 pm, Philipp Lehman

>
> That's why I believe that the traditional BIB files are a rather
> hopeless case. Mind you, the example above is a very basic thing. If
> you look at something like MODS[1], you'll realize just how hopeless
> the case of the traditional BIB format is (again, not because of its
> syntax but because of the data model).
>
> [1]http://www.loc.gov/standards/mods/

In general I think I see your point, but there is a lot of data about
publications that is not needed for a citation. I'm not sure a file
format suitable for a national library wanting to store all
information about a publication is necessarily what is needed by most
current LaTeX/BibTeX users. I'd imagine that a lot of people would be
wary of a format that did not stay broadly human readable and editable
using a text editor. As it is, XML is a pain to read in something
like Notepad (I find the .bib format rather clearer). So the
challenge is to provide enough information for the job in hand without
making it unattractive to the user base.
--
Joseph Wright

Joseph Wright

Oct 23, 2008, 11:13:44 AM

On Oct 23, 4:10 pm, Joseph Wright <joseph.wri...@morningstar2.co.uk>
wrote:

> In general I think I see your point, but there is a lot of data about
> publications that is not needed for a citation. I'm not sure a file
> format suitable for a national library wanting to store all
> information about a publication is necessarily what is needed by most
> current LaTeX/BibTeX users. I'd imagine that a lot of people would be
> wary of a format that did not stay broadly human readable and editable
> using a text editor. As it is, XML is a pain to read in something
> like Notepad (I find the .bib format rather clearer). So the
> challenge is to provide enough information for the job in hand without
> making it unattractive to the user base.

I should add that I suspect that there will always be a non-trivial
number of cases that can only be handled by hand. Some items simply
do not fit into any neat category, and the differing conventions of
different publishers make some hand-editing inevitable.
--
Joseph Wright

Philipp Lehman

Oct 23, 2008, 11:55:45 AM

Joseph Wright wrote:

>> [1]http://www.loc.gov/standards/mods/
>
> In general I think I see your point, but there is a lot of data
> about publications that is not needed for a citation. I'm not sure
> a file format suitable for a national library wanting to store all
> information about a publication is necessarily what is needed by
> most current LaTeX/BibTeX users. I'd imagine that a lot of people
> would be wary of a format that did not stay broadly human readable
> and editable using a text editor. As it is, XML is a pain to read
> in something like Notepad (I find the .bib format rather clearer).
> So the challenge is to provide enough information for the job in
> hand without making it unattractive to the user base.

MODS or anything comparable in complexity would be a long-term goal
and we'd only support a small subset. As to editing the files, the
point of XML is that it's easier for graphical frontends to adopt.

Anyway, I'm not saying that MODS is the way to go. I think that, if we
move to a completely new data model, it shouldn't be a homebrew
thing.

Joseph Wright

Oct 23, 2008, 12:11:52 PM

On Oct 23, 4:55 pm, Philipp Lehman <devnull.1.leh...@spamgourmet.com>
wrote:

> MODS or anything comparable in complexity would be a long-term goal
> and we'd only support a small subset.

I'd assumed you meant that, but just thought it was worth mentioning.

> As to editing the files, the
> point of XML is that its easier to adopt by graphical frontends.

Yes, but LaTeX users don't tend to want graphical tools per se. I use
JabRef for my .bib files, but I still like to know that I can run them
through a text editor. I'd imagine the Emacs users amongst us
*really* like the plain-text nature of the .bib format.

> Anyway, I'm not saying that MODS is the way to go. I think that, if we
> move to a completely new data model, it shouldn't be a homebrew
> thing.

I'm guessing that there are a limited number of non-homebrew solutions
that have wide support. Looking at the commercial world, EndNote et
al. use a similar underlying idea to BibTeX (the idea of properties
doesn't come up there either). So I assume that one of the models
suggested here would have to be picked.
--
Joseph Wright

Gerd Neugebauer

Oct 23, 2008, 3:55:26 PM

Philipp Lehman wrote:
> Gerd Neugebauer wrote:

>> Beside I have started writing a compiler from this nasty BST
>> language to Groovy. This is also a step towards a more modern
>> language in this context.
>
> Please correct me if I'm mistaken: this sounds like ExBib replaces
> BibTeX in such a way that BST files are coded in Groovy. The overall
> architecture would be similar to BibTeX. That is, you're writing a
> more or less ready-to-use BBL file which is read by Latex.

For the Groovy part you are right. You are writing the style in
Groovy instead of the BST language. This is not my main idea, though.
I just wanted a proof of concept for the extensibility with an
arbitrary scripting language (one supported by BSF).

I have started the ExBib experiment because I want to investigate how to
interface ExTeX with "external" functionality. This means I am aiming at
a situation where the TeX, xindy, ExBib, etc. functionalities are not
separate programs interacting via a file interface but are tightly
integrated and run in the same process.

> Citations
> are not dealt with by ExBib. Is that correct?

I am not sure what you mean by citations.
Currently ExBib does the same as BibTeX when used from the command line:
It reads one or more aux files and scans for \citation, \bibstyle, and
\bibdata (and \biboptions) and extracts the citations found in \citation
from the databases found in \bibdata.
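As a rough illustration of that first step, the following Python sketch collects the arguments of \citation, \bibstyle and \bibdata from the text of an aux file. It is not ExBib or BibTeX code; the function name and the sample input are made up, and the ExBib-specific \biboptions command is left out.

```python
import re

# Matches \citation{...}, \bibstyle{...} and \bibdata{...} in an .aux file.
AUX_COMMAND = re.compile(r"\\(citation|bibstyle|bibdata)\{([^}]*)\}")


def scan_aux(text: str) -> dict[str, list[str]]:
    found: dict[str, list[str]] = {"citation": [], "bibstyle": [], "bibdata": []}
    for command, argument in AUX_COMMAND.findall(text):
        # Arguments may be comma-separated lists of keys or file names.
        for item in argument.split(","):
            item = item.strip()
            # Keep only the first occurrence, in order of appearance.
            if item and item not in found[command]:
                found[command].append(item)
    return found


sample_aux = r"""
\relax
\citation{6names}
\citation{6names2,6names}
\bibstyle{plain}
\bibdata{refs,extra}
"""

result = scan_aux(sample_aux)
print(result["citation"], result["bibstyle"], result["bibdata"])
```

The keys collected under "citation" are then the ones to look up in the databases listed under "bibdata".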

>> For ExBib i have experimented with different input languages. For
>> backward compatibility my focus is on the classical BibTeX type of
>> files. Nevertheless I also have worked on an XML and on a lisp
>> parser (and prettyprinter). Thus playing around with the syntax
>> should not be a big deal.
>
> It's not just the syntax, it's the data model. To me, one of the major
> motivations for dumping Bibtex would be adopting a new model for
> bibliographic data (I'm not saying that it would be a big deal to
> implement that. I'm merely stressing the difference between the
> representation of the data -- BIB vs. XML vs. whatever -- and the
> hierachy implied by the data model).

I have also struggled with the data model. But I have come to the
conclusion that the only thing you see outside is the syntax. The BibTeX
syntax is rather general. There are only a few limitations:

- The key of an entry is unique (and not empty)
- The key of an item is unique

Once I thought that it might be a good idea to have a tree-like
structure in bibliographic entries. Now I am nearly convinced that this
is also just syntactic sugar which can be implemented with the crossref
feature.
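For readers unfamiliar with the mechanism, here is a rough Python sketch of single-level crossref inheritance: an entry takes over every field it lacks from the entry named in its crossref field. The dictionary layout, the entry keys and the function name are assumptions for illustration only; this is not how ExBib or BibTeX is implemented.

```python
def resolve_crossref(entries: dict[str, dict[str, str]], key: str) -> dict[str, str]:
    # Work on a copy so the database itself is left untouched.
    entry = dict(entries[key])
    parent_key = entry.pop("crossref", None)
    if parent_key is not None:
        # Only one level is followed here; nested references are ignored.
        for name, value in entries[parent_key].items():
            if name != "crossref":
                # Fields set on the child always win over inherited ones.
                entry.setdefault(name, value)
    return entry


entries = {
    "chapter": {"title": "A Chapter", "crossref": "book"},
    "book": {"booktitle": "A Book", "publisher": "DTV", "year": "2006"},
}

resolved = resolve_crossref(entries, "chapter")
print(resolved)
```

A publisher with its own locations could be modelled the same way, as a separate entry that the book entry points to.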

> One of the problems I've run into while working on biblatex is that
> you can't attach properties to list items (or attach a list to an
> item in some other list). Consider this simple case: a book with two
> publishers. The cover says:
>
> Deutscher Taschenbuch-Verlag
> München
> Walter de Gruyter
> Berlin · New York
>
> All you can do in this case is to concatenate the publisher/location
> data -- which means that you're effectively discarding some of the
> information at hand:
>
> publisher = {DTV and Walter de Gruyter},
> location = {M{\"u}nchen and Berlin and New York},
>
> We'd need something like this (written in pseudo-BIB syntax):
>
> publisher = {
> name = {DTV},
> location = {München},
> },
> publisher = {
> name = {Walter de Gruyter},
> location = {Berlin},
> location = {New York},
> },

I agree that this would be nice to express. Nevertheless I think this
can already be expressed (in a rather ugly way) with the current BibTeX
syntax. I will not show it since I am convinced that the syntactic sugar
you have shown is nice and I cannot compete.

One problem is that you need a style to cope with this information.

(Another problem is that users are already overwhelmed by the features
of BibTeX, not to speak of any extension.)

> What's more, there are localization issues even on the data level:
>
> publisher = {
> name = {DTV},
> location[german] = {München},
> location[english] = {Munich},
> ...
> },

Again syntactic sugar:

publisher = {
name = {DTV},
location.german = {München},
location.english = {Munich},
...
},

This is legal BibTeX syntax and you just need a proper bst to process it.
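What such a style would have to do with the dotted names can be sketched in a few lines of Python. This is a fully flattened variant of the naming scheme above (publisher.location.german and so on), chosen only for illustration; the function name is hypothetical, and the sketch assumes that no name is used both as a value and as a prefix of a longer name.

```python
def nest_fields(flat: dict[str, str]) -> dict:
    nested: dict = {}
    for dotted_name, value in flat.items():
        # "publisher.location.german" -> path ["publisher", "location"], leaf "german"
        *path, leaf = dotted_name.split(".")
        node = nested
        for part in path:
            # Create intermediate levels on demand.
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


flat_fields = {
    "publisher.name": "DTV",
    "publisher.location.german": "München",
    "publisher.location.english": "Munich",
}

print(nest_fields(flat_fields))
```

So the tree is recoverable from a flat list of fields as long as the names follow an agreed convention.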

> That's why I believe that the traditional BIB files are a rather
> hopeless case. Mind you, the example above is a very basic thing. If
> you look at something like MODS[1], you'll realize just how hopeless
> the case of the traditional BIB format is (again, not because of its
> syntax but because of the data model).
>
> [1] http://www.loc.gov/standards/mods/

I agree that the model behind BibTeX is rather simple-minded. But it is
not restricted to bibliographies. I have used it for address lists and
other things as well.

And again. All you see from the outside is the syntax. Let's agree on a
syntax and leave it to the internals of the program how to deal with it...

Ciao
Gerd

Jellby

Nov 1, 2008, 8:00:16 AM

Among other things, Charles de Miramon saw fit to write:

> Since I have started using biblatex, I was wondering how difficult it
> would be to code a quick replacement of bibtex for biblatex in Python or
> any other language.

I actually did something like that some time ago.

My first goal was to have better control over how the different parts of
a name (first, last, von, junior) are recognized and abbreviated. And, of
course, to have Unicode support.

So I started writing a Perl tool that would just replace bibtex when used
with biblatex, i.e., it would read the .aux and .bib files and generate
the .bbl file needed by biblatex. I managed to get it to a point where I
could use it for my own needs, which means it does not support all the
biblatex options, specifically the unique name generation and sorting (I
use unsorted numerical labels).

I haven't touched it since, but I'd be willing to send it to anyone
interested (I did send it to Philipp Lehman back then).

--
Ignacio Fernández Galván
Linux user #289967, PGP Pub Key 0x01A95F99
http://djelibeibi.unex.es
jellby yahoo.com

PK

Mar 29, 2011, 5:21:51 AM

On Tuesday, October 21, 2008 8:07:05 PM UTC+2, phil...@gmail.com wrote:

> Right, I think I now appreciate more where the division of labour is
> between BiBTeX and Biblatex which means that your previous post about
> a BiBTeX replacement is really the issue.

Replying to myself over two years later in order to close this thread
for anyone else researching this. As a result of this issue, I became
involved in the development of biber and this week biblatex 1.4/biber
0.9 will be released which implements the functionality discussed in
this thread. To recap one of the original examples:

The uniquename option now disambiguates all names globally within a
refsection. So, with uniquename=init and labelyear=true:

@BOOK{2names,
AUTHOR = {Alfred Armstrong and Bertie Butterford},
TITLE = {Title 1},
YEAR = {2002}
}

@BOOK{2names2,
AUTHOR = {Peter Armstrong and Bertie Butterford},
TITLE = {Title 2},
YEAR = {2003}
}

@BOOK{2names3,
AUTHOR = {Peter Armstrong and Bertie Butterford},
TITLE = {Title 3},
YEAR = {2003}
}

You will get the citations:

A. Armstrong and Butterford (2002)
P. Armstrong and Butterford (2003a)
P. Armstrong and Butterford (2003b)

That is, ambiguous names are now disambiguated by initials/full names
and ambiguous years by the labelyear mechanism.
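The name part of this can be approximated in a few lines of Python. This is only a simplified sketch of the idea behind uniquename=init, not biber's actual algorithm: a family name gets an initial only if it occurs with more than one given name. The function name and the (given, family) tuple layout are assumptions, and the case where the initials collide as well (which would call for full names) is ignored.

```python
from collections import defaultdict


def label_names(names: list[tuple[str, str]]) -> dict[tuple[str, str], str]:
    # Collect every given name seen for each family name.
    given_by_family: dict[str, set[str]] = defaultdict(set)
    for given, family in names:
        given_by_family[family].add(given)

    labels = {}
    for given, family in set(names):
        if len(given_by_family[family]) > 1:
            # Ambiguous family name: prefix the initial of the given name.
            labels[(given, family)] = f"{given[0]}. {family}"
        else:
            labels[(given, family)] = family
    return labels


names = [
    ("Alfred", "Armstrong"), ("Bertie", "Butterford"),
    ("Peter", "Armstrong"), ("Bertie", "Butterford"),
]
labels = label_names(names)
print(labels[("Alfred", "Armstrong")], "|", labels[("Bertie", "Butterford")])
```

Butterford stays as it is because only one person with that family name occurs in the data.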

To take the other example from the original problem:

@BOOK{6names,
AUTHOR = {Alan Armstrong and Bertie Butterford and Cecil Cuthbertson and Dickie Dartmouth and Eric Eroldson and Freddie Foundlake},
TITLE = {Test 1},
YEAR = {2006}
}

@BOOK{6names2,
AUTHOR = {Alfred Armstrong and Graham Grimstead and Bertie Butterford and Cecil Cuthbertson and Dickie Dartmouth and Eric Eroldson},
TITLE = {Test 2},
YEAR = {2006}
}

Using the options uniquelist=true, uniquename=false will result in:

Armstrong, Butterford, et al. (2006)
Armstrong, Grimstead, et al. (2006)

That is, each name list is automatically extended just far enough to make it unambiguous.
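The underlying idea can be sketched in Python as a search for the shortest distinguishing prefix. Again, this is a simplified illustration and not biber's implementation: it works on family names only, the function names are made up, the other lists are assumed to belong to different author groups, and the comma before "et al." simply follows the output shown above.

```python
def unique_prefix_length(names: list[str], others: list[list[str]], minnames: int = 1) -> int:
    # Grow the visible part of the list until no other list starts the same way.
    for n in range(minnames, len(names) + 1):
        prefix = names[:n]
        if all(other[:n] != prefix for other in others):
            return n
    # No distinguishing prefix exists: show the whole list.
    return len(names)


def cite_label(names: list[str], others: list[list[str]], minnames: int = 1) -> str:
    n = unique_prefix_length(names, others, minnames)
    shown = ", ".join(names[:n])
    if n == len(names):
        return shown  # nothing was truncated
    separator = ", " if n > 1 else " "
    return f"{shown}{separator}et al."


list_a = ["Armstrong", "Butterford", "Cuthbertson", "Dartmouth", "Eroldson", "Foundlake"]
list_b = ["Armstrong", "Grimstead", "Butterford", "Cuthbertson", "Dartmouth", "Eroldson"]

print(cite_label(list_a, [list_b]))
print(cite_label(list_b, [list_a]))
```

A list that is not ambiguous in the first place keeps its minnames truncation, since the loop stops at the first length that already distinguishes it.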

These two new behaviours can be combined freely. For example, the previous example with
uniquelist=true, uniquename=full gives:

Alan Armstrong, et al. (2006)
Alfred Armstrong, et al. (2006)