
biblatex-apa - work underway


phil...@gmail.com

Oct 22, 2008, 11:22:44 AM
I am working on a fairly detailed APA style for biblatex and have
completed a decent draft of the citations style. I'm currently working
on the references style. Now I realise why I couldn't find such a
style when I looked for one - it's a bit of a beast. At least the APA
style guide is very explicit about the requirements. The problem is
that some of the requirements are beyond the current capabilities of
BibTeX and therefore biblatex. However, these are not huge issues at
the moment and will probably be fixed eventually, either when BibTeX
1.0 adds new features or, more likely, when it's replaced by a more
modern format/language.

So, soonish I will have a 0.1 release of this style if anyone would
like to try it out. The style files are documented with references to
the APA style manual section numbers so you can tell which
requirements each bit of code is implementing.

Joseph Wright

Oct 22, 2008, 2:39:43 PM

From experience writing some chemistry styles for BibTeX and biblatex,
I think that you are lucky to have detailed rules. I have "do this
for journal articles, then, erm, make it up" as a guide!

I'd point out that I suspect any automated system will struggle under
some circumstances. There are always some difficult-to-handle edge
cases.

Good luck: I get the impression the APA style is picky.
--
Joseph Wright

phil...@gmail.com

Oct 22, 2008, 3:01:10 PM
On Oct 22, 8:39 pm, Joseph Wright <joseph.wri...@morningstar2.co.uk>
wrote:

> I'd point out that I suspect any automated system will struggle under
> some circumstances.  There are always some difficult-to-handle edge
> cases.
>
> Good luck: I get the impression the APA style is picky.

It is very picky but thankfully extremely explicit about everything.
The tricky (currently impossible) bits to automate are actually pretty
central and important in general for many styles that need to
disambiguate truncated author lists ("et al" truncations). See the
other thread on this group currently with title:

Biblatex: Uniquely truncating author name lists?

The issue is pretty fundamental and requires a BibTeX upgrade or, more
likely, a complete replacement.
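
To make the problem concrete, here is a minimal sketch in Python of one way to truncate author lists uniquely. It is not what BibTeX or biblatex does and it is not the exact APA rule; the function names (`label`, `names_needed`, `format_citation`), the `(authors, year)` tuple layout and the `min_names` parameter are all assumptions made for illustration. The idea is to show as few names as possible and to add one more name only while some other entry would produce the same truncated label.

```python
from typing import List, Tuple

# Hypothetical entry layout: (author family names, year).
Entry = Tuple[List[str], int]


def label(authors: List[str], year: int, n: int):
    """Label an entry would get if only its first n names were shown."""
    truncated = len(authors) > n  # True means "et al." is appended
    return (tuple(authors[:n]), truncated, year)


def names_needed(entry: Entry, entries: List[Entry], min_names: int = 1) -> int:
    """Smallest number of names that makes the entry's label unique."""
    authors, year = entry
    n = min(min_names, len(authors))
    while n < len(authors):
        clash = any(
            other_authors != authors
            and label(other_authors, other_year, n) == label(authors, year, n)
            for other_authors, other_year in entries
        )
        if not clash:
            break
        n += 1  # show one more name and try again
    return n


def format_citation(entry: Entry, n: int) -> str:
    """Render the entry with n visible names, adding "et al." if truncated."""
    authors, year = entry
    text = ", ".join(authors[:n])
    if len(authors) > n:
        text += " et al."
    return f"{text} ({year})"


entries = [
    (["Smith", "Jones", "Brown"], 2008),
    (["Smith", "Jones", "Green"], 2008),
    (["Smith", "Adams"], 2008),
    (["Miller", "Lee"], 2007),
]

for e in entries:
    print(format_citation(e, names_needed(e, entries)))
```

The point of the sketch is that the answer for one entry depends on every other entry in the bibliography, so it cannot be decided while looking at a single record in isolation.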

PK

Joseph Wright

Oct 22, 2008, 4:10:36 PM
On Oct 22, 8:01 pm, philk...@gmail.com wrote:
>
> > Good luck: I get the impression the APA style is picky.
>
> It is very picky but thankfully extremely explicit about everything.
> The tricky (currently impossible) bits to automate are actually pretty
> central and important in general for many styles that need to
> disambiguate truncated author lists ("et al" truncations). See the
> other thread on this group currently with title:
>
> Biblatex: Uniquely truncating author name lists?
>
> The issue is pretty fundamental and requires a BibTeX upgrade or, more
> likely, a complete replacement.

I've been following that with interest. I've asked before here about
BibTeX replacements. There are lots of ideas, but I think that
something that is evolution not revolution is really needed (for
example, simply adding UTF8 support to bibtex8 would be very useful,
although not necessarily easy). Perhaps LuaTeX will save the day
(although I'm not sure that a specialist tool isn't a better idea).
--
Joseph Wright

Simon Spiegel

Oct 22, 2008, 5:51:56 PM
On 2008-10-22 16:10:36 -0400, Joseph Wright
<joseph...@morningstar2.co.uk> said:

(Sorry if this turns out to be a double post. I wrote a similar post
earlier today, but I think it was lost.)

The great thing about the BibTeX format is that it's extremely simple,
the bad thing about the BibTeX format is that it's extremely simple. ;)

While probably everyone agrees that some kind of BibTeX successor is
needed I think it would be a waste of energy to try to come up with
something completely new just for the LaTeX world. One of the great
things about the BibTeX format is that while it has its roots in the
LaTeX world it has been widely used as an exchange format because it is
so simple to write. I think that a hypothetical BibTeX successor
geared mainly at LaTeX would be doomed to failure because LaTeX just
isn't that widespread anymore. For a new data format or model, it's
very important to look at what happens outside of the LaTeX universe.
If we have a new format no one supports, it won't help us much. And
there is a lot going on here, especially stuff related to Zotero
(http://www.zotero.org ).

There is, on the one hand, CSL (Citation Style Language)
(http://xbiblio.sourceforge.net/ ), which, as far as I understand, has
similar goals to biblatex: offering a metalanguage for complex
citation styles. CSL certainly could be implemented for LaTeX which
would mean that all CSL styles which already exist could be used for
LaTeX. This would probably leave biblatex out completely, but it would
have the advantage of a well documented and powerful language which is
also used outside of the LaTeX world.

Bruce D'Arcus, who is the driving force behind CSL, is also working on
the Bibliographic Ontology Specification (http://bibliontology.com/ ),
which offers a data model for bibliographies. I don't know how useful
this already is in practice, but since it should work with both
Zotero and OpenDocument and is quite comprehensive, it certainly seems
like something to consider for a potential BibTeX successor. I don't
really understand the technical side of this, but I think this is also
something which could be made to work with biblatex.

It would be interesting to know what Philipp thinks about those two projects.

simon


phil...@gmail.com

Oct 22, 2008, 6:18:15 PM
On Oct 22, 11:51 pm, Simon Spiegel <si...@simifilm.ch> wrote:

> It would be interesting to know what Philipp thinks about those two projects.

That really is the crucial thing here, yes ...

I can't see that these projects would replace biblatex since biblatex
is really about formatting and typesetting and not data models etc.
per se. It needs a data backend but the formatting is really the
essential issue and that's completely LaTeX specific. There is no
doubt that BibTeX's data model and file format are just ripe for being
replaced by an XML format. The .bst language is also well past its
prime and needs replacing. The jobs that need doing here are mainly
string manipulation and that's one area where modern languages have
developed very well.

Simon Spiegel

Oct 22, 2008, 7:54:16 PM

You really have to differentiate between the two projects. CSL
basically tries to cover the same area as biblatex: It offers a
language for formatting the data. I can't comment on how the two
compare (I imagine biblatex offers much more fine-grained control, but
that's just a guess) but they're both about how the data is presented.
The Bibliographic Ontology, on the other hand, is not concerned with
the formatting but with a data model. Again, I can't comment on how
it actually performs, but I know that Bruce is very much concerned with
providing a model which can cover even esoteric instances.

Anyway, what I was trying to get at: I certainly don't think it would
be wise to come up with some kind of new data format just for the LaTeX
world. Whether it's Bibliographic Ontology or something completely
different, I think it's really important to use some kind of wider
established standard.

simon

phil...@gmail.com

Oct 23, 2008, 7:48:39 AM
On Oct 23, 1:54 am, Simon Spiegel <si...@simifilm.ch> wrote:

> You really have to differentiate between the two projects. CSL
> basically tries to cover the same area as biblatex: It offers a
> language for formatting the data.

I'm not sure that this is quite right - CSL says it is essentially a
replacement for the .bst BibTeX format and biblatex is mainly a whole
load of LaTeX-specific stuff built on top of that. It looks like CSL
could replace both the BibTeX .bib format and the .bst formatting
language but the complication is the following:

Normally, BibTeX outputs a set of commands which LaTeX understands and
can include directly in a file. There is nothing to stop other
programs using the BibTeX output and interpreting it as they see fit,
which has been done.
However, BibLaTeX has a very odd .bst file which results in what is
essentially a flat-file DB from BibTeX - no formatting commands at
all. This is all (mostly) handled by BibLaTeX's completely LaTeX
specific styles. I think most would agree that this is great for LaTeX
as you get really fine grained and extensible control over the layout
of citations and references, just like you do with formatting in
general. This is a big move forward for LaTeX as it brings
bibliographies much more into the LaTeX realm. It doesn't bring it all
the way in because of current BibTeX limitations.

There is no doubt that a replacement for the data-transformation
language (.bst stuff) is needed as that's the main limitation right
now. This almost certainly needs a new data language and data model
(.bib file stuff). But almost certainly not a new style language as
biblatex is by far the best one available for LaTeX users. Once the
data has been converted from the data language into some neutral
format (probably XML based), it's in the hands of the style engine
which glues it into LaTeX and that's where biblatex is necessary.

So, we have this sort of arrangement I think:

Bibliography format (.bib files, XML files, relational DB etc.) ->
Data model implementation language (.bst, CSL etc.) ->
Style engine (BibTeX, biblatex etc.) ->
Formatting commands (LaTeX etc)

Biblatex is unlikely to go away as it's now, I think, the best choice
for stage three above (for LaTeX users anyway). The first two stages
are what is in question here as they are limiting biblatex in various
ways. The confusion is that BibTeX is a file format, a data model and
a processor. biblatex doesn't rely (or want to rely anyway) on all
aspects of BibTeX - it wants to do the formatting aspect itself. So
the real issue with biblatex is that it wants to take over the
formatting aspect of the .bst file but still use an external data
model and file format. With BibTeX, you can't separate these cleanly,
hence the need to replace BibTeX with something which allows this.

Philipp Lehman

Oct 23, 2008, 10:58:57 AM
phil...@gmail.com wrote:

> So, we have this sort of arrangement I think:
>
> Bibliography format (.bib files, XML files, relational DB etc.) ->
> Data model implementation language (.bst, CSL etc.) ->
> Style engine (BibTeX, biblatex etc.) ->
> Formatting commands (LaTeX etc)

That sums it up very well. From the perspective of biblatex, I'd add
two details, [2b] and [5]:

[1] Bibliography format (BibTeX/biblatex.bst) ->
[2a] Data model implementation language (BibTeX) ->
|-> [2b] Style engine data interface (biblatex.bst) ->
|-> [3] Style engine (biblatex.sty) ->
| [4] Formatting commands (biblatex.sty)
|-- [5] User interface (biblatex.sty)

So let's look at alternatives. Gerd Neugebauer mentioned ExBib. If my
interpretation of his description is correct, moving to ExBib would
lead to the following workflow:

[1] Bibliography format (ExBib, based on BibTeX) ->
|-> [2a] Data model implementation language (ExBib/Java) ->
|-> [2b] Style engine data interface (ExBib/Groovy) ->
| [3] Style engine (biblatex.sty) ->
| [4] Formatting commands (biblatex.sty)
|-- [5] User interface (biblatex.sty)

Essentially, ExBib would replace BibTeX and biblatex.bst would be
replaced by a special ExBib style written in Groovy. This would buy us
a more modern 'layer [2]' tool which handles Unicode, allows more
communication between [3]-[5] and [2a]/[2b], and is probably easier
to code on the [2b] layer (I don't grok Groovy, though...).

In theory, it may even be possible to use CiteProc to do the same job:

[1] Bibliography format (MODS) ->
[2a] Data model implementation language (CiteProc/XSLT) ->
[2b] Style engine data interface (CiteProc/CSL) ->
[3] Style engine (biblatex.sty) ->
[4] Formatting commands (biblatex.sty)
[5] User interface (biblatex.sty)

This would effectively port the initial shift introduced by
biblatex -- i.e., use an application (BibTeX) and a scripting
language (BST) originally designed to output printable data -- to a
new application (CiteProc) and a new language (CSL). This adds quite a
bit of bloat to the workflow because most of what CSL is designed for
is done in TeX if you use biblatex (and using TeX on this layer is
obviously the whole point).

Personally, I'd prefer a leaner solution which combines [2a] and [2b].
Let's assume we settle with a MODS-based solution for the data and
use a Perl script as a database frontend:

[1] Bibliography format (XML/MODS) ->
|-> [2] Database frontend + data interface (Perl script) ->
|-- [3] Style engine/formatting/interface (biblatex.sty)

This is more effective and much easier to implement. It's also in line
with the philosophy behind biblatex. After all, the original idea
was: citations and bibliographies are about formatting data in a
certain way. TeX is good at typesetting, LaTeX is used anyway, LaTeX
users know LaTeX, so why not do it in TeX? Problems: sorting and
string manipulation can't be done in TeX in a robust and efficient
way, so some sort of mediation between the raw data and the style
engine is required. This layer is currently provided by biblatex.bst
which in turn uses BibTeX.
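
As a rough illustration of the kind of work this mediation layer does, here is a minimal Python sketch of sort-key generation for bibliography entries. It is not how biblatex.bst or any existing tool works; the dictionary layout and the names `sort_key_text` and `entry_sort_key` are assumptions for illustration only. The key is deliberately naive: it strips combining accents and ignores case, and it knows nothing about locale-specific collation rules.

```python
import unicodedata


def sort_key_text(text: str) -> str:
    """Build a naive, accent- and case-insensitive sort key for a string."""
    # NFKD splits accented letters into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks so that "Müller" and "Muller" compare equal.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # casefold() is a more aggressive lower() (for example "ß" becomes "ss").
    return stripped.casefold()


def entry_sort_key(entry: dict):
    """Sort by author, then year, then title (a hypothetical sorting scheme)."""
    return (
        sort_key_text(entry["author"]),
        entry["year"],
        sort_key_text(entry["title"]),
    )


# Hypothetical records as they might arrive from a database frontend.
entries = [
    {"author": "Zola", "year": 1885, "title": "Germinal"},
    {"author": "Müller", "year": 2001, "title": "B"},
    {"author": "Åberg", "year": 1999, "title": "A"},
    {"author": "Muller", "year": 1998, "title": "C"},
]

for e in sorted(entries, key=entry_sort_key):
    print(e["author"], e["year"], e["title"])
```

A few lines in a general-purpose language cover normalisation and multi-level sorting, which is the part that is awkward to do robustly in TeX macros.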

However, there is no benefit at all in introducing a new language
(BST) to get the job done. The reasons for this are
purely historical (building on existing tools to get started
quicker) and the intermediate layer turned out to be a major
bottleneck.

To sum that up, bringing biblatex to the next level would require
three closely related but distinct things:

1) a comprehensive model for raw bibliographic data
2) a format for #1
3) a database frontend for #2

Getting back to some of the points brought up by Simon, I think it is
highly desirable to look at existing standards for the raw
bibliographic data. Alas, there is no such thing as an accepted
standard for bibliographic data! Libraries in different countries (and
possibly even in the same country) seem to do their own
thing. I believe that this issue can't be solved by bibliographic
tools in a satisfying way. It must be addressed on an entirely
different level. The interesting thing about MODS is that it's
located on the right level. A national library is the first
institution involved in the process of standardizing raw
bibliographic data. I've never looked at MODS in great detail but I
agree that it's an interesting and highly relevant development.

Whether building on existing tools like CiteProc+CSL or ExBib makes
sense is a difficult question. It would imply that we have the same
somewhat convoluted division of labor we have with
BibTeX/biblatex.bst since these tools are designed to produce
printable output. The nice thing about biblatex is that it's tightly
integrated with the typesetting process. It only requires a
specialized external application which does all the things you can't
do in TeX in a satisfying way.

On a side note, I wonder how CiteProc/CSL would handle rules such
as "replace a repeated citation with 'ibidem' unless it is the first
citation on the current page". Citations are not necessarily fixed
strings. Their format may depend on their position in the flow of
text.
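
A minimal Python sketch of such a position-dependent rule follows. It only models the logic; the class name `CitationTracker`, the `cite` method and the bracketed output format are assumptions for illustration and do not correspond to any biblatex or CiteProc interface. Note that the page number has to be supplied by whatever knows where the citation ends up on the page, which in practice is the typesetter.

```python
class CitationTracker:
    """Track the previous citation to decide when 'ibidem' may be used."""

    def __init__(self):
        self.last_key = None   # key of the previous citation, if any
        self.last_page = None  # page on which the previous citation appeared

    def cite(self, key: str, page: int) -> str:
        # 'ibidem' is allowed only if the same work was cited immediately
        # before AND that previous citation is on the same page, i.e. this
        # is not the first citation on the current page.
        repeated = key == self.last_key
        same_page = page == self.last_page
        self.last_key, self.last_page = key, page
        if repeated and same_page:
            return "ibidem"
        return f"[{key}]"


tracker = CitationTracker()
for key, page in [("knuth84", 1), ("knuth84", 1), ("knuth84", 2), ("lamport94", 2)]:
    print(page, tracker.cite(key, page))
```

The same key therefore produces different output depending on state that only exists during typesetting.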

--
Sender address blackholed; do not reply to From: address.
You can still reach me by email at: plehman gmx net.

phil...@gmail.com

Oct 23, 2008, 6:24:13 PM
On Oct 23, 4:58 pm, Philipp Lehman <devnull.1.leh...@spamgourmet.com>
wrote:

>     [1] Bibliography format (XML/MODS) ->
> |-> [2] Database frontend + data interface (Perl script) ->
> |-- [3] Style engine/formatting/interface (biblatex.sty)

Right, I completely agree - biblatex is completely fine from the Style
Engine layer up and any solution involving a new tool which also has
functionality at levels 3 and above is:

1) Almost certainly going to be a pain because it would have to be
circumvented and crippled on purpose, just like the full functionality
of BibTeX currently already is in biblatex.
2) Would be a waste of the development time in such a tool since, by
point 1, a lot of its functionality would be ignored. This would be
like using perl only as a sed replacement.

CSL and ExBib seem to want to do parts of what biblatex already
prevents BibTeX from doing, so they look doubtful as clean solutions to me.
All that's needed is data in a well-specified format feeding to a
style engine, just like the XML/XSL boundary. Perl/XML would be ideal
but then, I like perl. I looked at the biber code on sourceforge but
it seems not to have been touched for quite a while and it isn't very
easy to read as it's not OO. I think that what's needed, no more and
no less is:

1) Model and an instantiation schema for the model. That is, an XML
schema which is basically both.
2) Parser for tokens of the schema type.
3) Interface language definition (.bbl?)
4) Program to implement the parser and output in the interface
language.
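
The four steps above can be sketched roughly as follows in Python (chosen only for illustration; the discussion itself suggests Perl). The element names come from the MODS schema, but the `Record` class, the function names and especially the tab-separated output are assumptions: the output is an invented stand-in for an interface language and is not biblatex's actual .bbl format.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List

MODS_NS = {"m": "http://www.loc.gov/mods/v3"}

# A tiny hand-written MODS collection used as sample input.
SAMPLE = """<modsCollection xmlns="http://www.loc.gov/mods/v3">
  <mods ID="knuth1984">
    <titleInfo><title>The TeXbook</title></titleInfo>
    <name type="personal">
      <namePart type="family">Knuth</namePart>
      <namePart type="given">Donald E.</namePart>
    </name>
    <originInfo><dateIssued>1984</dateIssued></originInfo>
  </mods>
</modsCollection>"""


@dataclass
class Record:
    """Step 1: a (very reduced) data model for one bibliography entry."""
    key: str
    authors: List[str]
    title: str
    year: str


def parse_mods(xml_text: str) -> List[Record]:
    """Step 2: parse MODS XML into Record objects."""
    root = ET.fromstring(xml_text)
    records = []
    for mods in root.findall("m:mods", MODS_NS):
        authors = []
        for name in mods.findall("m:name", MODS_NS):
            family = name.findtext("m:namePart[@type='family']", "", MODS_NS)
            given = name.findtext("m:namePart[@type='given']", "", MODS_NS)
            authors.append(f"{family}, {given}".strip(", "))
        records.append(Record(
            key=mods.get("ID", ""),
            authors=authors,
            title=mods.findtext("m:titleInfo/m:title", "", MODS_NS),
            year=mods.findtext("m:originInfo/m:dateIssued", "", MODS_NS),
        ))
    return records


def emit(records: List[Record]) -> str:
    """Steps 3 and 4: write records in a hypothetical key/field/value format."""
    lines = []
    for r in sorted(records, key=lambda rec: rec.key):
        lines.append(f"{r.key}\tauthor\t{' and '.join(r.authors)}")
        lines.append(f"{r.key}\ttitle\t{r.title}")
        lines.append(f"{r.key}\tyear\t{r.year}")
    return "\n".join(lines)


print(emit(parse_mods(SAMPLE)))
```

Nothing in this sketch makes formatting decisions; it only hands clean, sorted data to the style engine, which is the division of labour being argued for here.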

The main question which comes to mind is:

1) Do you still want to use .bbl format files as the interface
language? That is, how much of the problems with BibTeX are in the
data model, how much in the .bst language and how much in the data
transition format (.bbl)? I'm assuming it would mean a fairly major
biblatex.sty rewrite to change from .bbl to something else?

PK

donate...@gmail.com

Oct 23, 2008, 9:18:42 PM
On Oct 23, 7:58 am, Philipp Lehman <devnull.1.leh...@spamgourmet.com>
wrote:

I wonder if you might misunderstand CSL and CiteProc? CiteProc is a
bit of a "meta" project & there are multiple implementations, each
with differing feature sets. The idea is to be able to take a diverse
set of bibliographic metadata (".bib files, XML files, relational DB
etc."), CSL style files, and a formatted document and to output a
formatted document with citations embedded.

A hypothetical implementation of CiteProc could take a .tex file, a .bib
file, and a .csl file & output a .tex file. There is no need for
BibTeX or BibLaTeX in such a toolchain. The implementation that is
closest to reaching this is the Haskell implementation with pandoc. The
input document format is the lightweight markup language 'markdown' &
the bibliographic metadata is MODS XML. The output could be any
language supported by pandoc, including TeX.

One could easily see this or a different implementation of CiteProc
being expanded to take .tex and .bib files as input.

In short: the workflow is not "bloated" by CSL, CSL does not seem to
do any work that TeX does, and you can already use LaTeX to typeset
documents that are used in a CiteProc/CSL toolchain (to obtain the
beautiful documents that were the point of Bib(La)TeX).

> On a side note, I wonder how CiteProc/CSL would handle rules such
> as "replace a repeated citation with 'ibidem' unless it is the first
> citation on the current page".

CSL does account for 'ibid' and the JavaScript implementation of
CiteProc used by Zotero handles it well. Refer to the Bluebook and
Chicago CSL styles in the Zotero repository for examples. Many of
those in the humanities are cursed by having both more complicated
bibliographic metadata (hence MODS) and more complicated styles (hence
CSL) than many of us in math/science/engineering who make up the
majority of the BibTeX/BibLaTeX/LaTeX/TeX userbase.

donate...@gmail.com

Oct 23, 2008, 10:03:04 PM
On Oct 23, 6:18 pm, donatetof...@gmail.com wrote:

> > On a side note, I wonder how CiteProc/CSL would handle rules such
> > as "replace a repeated citation with 'ibidem' unless it is the first
> > citation on the current page".
>
> CSL does account for 'ibid'

For clarity: CSL is document-format agnostic. Many document formats
(HTML, markdown, etc.) do not have well-defined page boundaries &
WYSIWYG word processors are crummy at deducing page boundaries. I
don't think any current CiteProc implementation would work exactly as
you describe--ibid is applied at the document level, rather than the
page level. There don't seem to be any inherent limitations that
would prevent a new or improved CiteProc from handling it, though.

phil...@gmail.com

Oct 24, 2008, 5:48:57 AM
On Oct 24, 3:18 am, donatetof...@gmail.com wrote:

> A hypothetical implementation of CiteProc could take a .tex file, a .bib
> file, and a .csl file & output a .tex file. There is no need for
> BibTeX or BibLaTeX in such a toolchain.

I think this is the bit that I have difficulty with - you can't
remove biblatex from the chain without removing all the LaTeX
functionality for typesetting. CSL could only deal with TeX/LaTeX
syntactically: since it isn't actually running inside a TeX compiler,
it's just strings, which is what XSLT is good at transforming. The
power of biblatex for LaTeX users is precisely that it's not just
syntactic - since it's running inside TeX/LaTeX, it's much more
powerful than a syntactic transform for bibliographies. Of course,
with any semantic transform like this, it's completely non-general -
biblatex only works with LaTeX. But like all semantic transforms, it's
much more powerful.

Of course CiteProc could output TeX/LaTeX commands but that would be,
as PL says, to re-implement a *huge* amount of biblatex work and I can't
see how it would deal with state and tracking variables like
\cbx@parens or whatever in biblatex. If the output of CiteProc is a
TeX/LaTeX file which is supposed to be the include file for the
bibliography, then it would have to be really complicated with a lot
of \defs in it to mimic the state which is held in memory by biblatex
as it processes things. Hence PL's question about idem and page-
tracking etc.

How is Citeproc going to handle citations? I can't see how that would
work at all if it didn't also try to process the document as well as
the bibliography. Might as well re-implement LaTeX as XML/XSLT. But
this is the wrong emphasis because XML/XSLT are about data transform,
not typesetting. What is being suggested of CiteProc here is like
saying that in converting a bib to HTML, you should also do the in-
browser layout and therefore partially replace Firefox. Firstly,
that's almost certainly too hard because it's not the right tool and
secondly, why bother replace a specialist tool that does a job really
well?

CiteProc looks really good at doing what it does but since there is no
such thing as a generic typesetting or presentation layer, some data
somewhere has to be passed to LaTeX to typeset and the whole point of
biblatex and parts of BibTeX is that this has to be tightly integrated
into the typesetter (TeX).

This is just a fact about typesetting which is not generally
understood - it's not all just strings being shuffled about - there is
a lot of semantic meta-information required (look at all the biblatex
options - they are semantic meta-information options which *must* be
tightly bound to the typesetting engine). I defy anyone to write a
typesetting system of *any* complexity where every stage can be
semantically de-coupled from every other stage. It's theoretically
impossible. You can start with "neutral" data in MODS or whatever but
citations and bibliography on the page are saturated with semantic
distinctions and decisions. Somewhere between the data and the page,
these semantic considerations have to be introduced. biblatex does
that well. A completely "just pass a file with all the information in
it to the next stage" approach can't possibly do this as it's just a
special case of the old AI problem of trying to reduce all semantics
to syntax.

Philipp Lehman

Oct 24, 2008, 6:13:26 AM
donate...@gmail.com wrote:
> I wonder if you might misunderstand CSL and CiteProc? CiteProc is a
> bit of a "meta" project & there are multiple implementations, each
> with differing feature sets. The idea is to be able to take a
> diverse set of bibliographic metadata (".bib files, XML files,
> relational DB etc."), CSL style files, and a formatted document and
> to output a formatted document with citations embedded.

I may very well misunderstand CSL and CiteProc. I don't know much
about the project, after all.

> One could easily see this or a different implementation of CiteProc
> being expanded to take .tex and .bib files as input.

Sure, that's perfectly clear.

> In short: the workflow is not "bloated" by CSL, CSL does not seem to
> do any work that TeX does, and you can already use LaTeX to typeset
> documents that are used in a CiteProc/CSL toolchain

There seems to be a misunderstanding indeed. Let me try to clarify.
biblatex is a style engine written in TeX which runs 'inside' the
typesetter. It does the same thing as CiteProc/CSL to some extent. It
also needs an external support tool to do some things you can't do in
TeX. Currently, it 'abuses' BibTeX for this very purpose. One of the
points of my previous post was: would it make sense to abuse CiteProc
instead? That's where the 'bloat' comes in.

I'm not saying that CiteProc/CSL is bloated. What I'm saying is that a
workflow which puts biblatex on top of a CiteProc/CSL toolchain adds
bloat. There is nothing wrong with CiteProc/CSL. The question is if
there may be something wrong with this particular workflow which
(ab)uses CiteProc/CSL to do a job it was not quite designed for.

I'm not judging CiteProc/CSL in any way. I think it's a very
impressive and highly ambitious project. biblatex is much more humble
in a lot of ways.

Philipp Lehman

Oct 24, 2008, 6:18:46 AM
donate...@gmail.com wrote:
> On Oct 23, 6:18 pm, donatetof...@gmail.com wrote:
>> > On a side note, I wonder how CiteProc/CSL would handle rules such
>> > as "replace a repeated citation with 'ibidem' unless it is the
>> > first citation on the current page".
>>
>> CSL does account for 'ibid'
>
> For clarity: CSL is document-format agnostic. Many document
> formats (HTML, markdown, etc.) do not have well-defined page
> boundaries &
> WYSIWYG word processors are crummy at deducing page boundaries.

Sure, but there are a lot of citation styles in the humanities which
involve information such as page numbers or footnote numbers. Style
guides don't seem to care about crummy word processors (which is one
reason why citations are usually handled manually in the humanities).

Anyway, I was only bringing this up out of curiosity.

donate...@gmail.com

Oct 24, 2008, 12:08:55 PM
> it's just strings, which is what XSLT is good at transforming.

You mention XSLT several times.

Again: CiteProc is more accurately a "meta" project. Only one
implementation of CSL is based on XSLT. The implementation used by
Zotero, for example, uses JavaScript to perform transformations and
uses no XSL.


> How is Citeproc going to handle citations? I can't see how that would
> work at all if it didn't also try to process the document as well as
> the bibliography.

It already does process documents in the same way that BibTeX does--
the document is an input & the output both changes that text (adding
citations) and appends the bibliography.

Unlike the current implementation of BibLaTeX, CiteProc handles
sorting & so would not seem to need BibTeX at all in the toolchain.
It is a BibTeX replacement.

It is not, as both you and Philipp point out, a BibLaTeX replacement.
I agree that it would be useful to have LaTeX styles in the
toolchain. But those who are using BibTeX+BST files don't really have
that now.


> Might as well re-implement LaTeX as XML/XSLT.

If anything, I'm arguing to re-implement CiteProc in LaTeX. However,
as you point out, some things are hard to do in LaTeX. So you mirror
the BibTeX+BibLaTeX relationship: make a stand-alone citeproc
executable (replacing bibtex) that handles XML parsing, text
transformation, and the other things that are "hard" to do in pure
LaTeX & make a LaTeX style that plays nicely with this stand-alone
executable.


> But
> this is the wrong emphasis because XML/XSLT are about data transform,
> not typesetting.

Again: neither CiteProc nor BibTeX does any sort of real typesetting
(nor should they). BibLaTeX does give instructions to the typesetter,
which seems useful.

donate...@gmail.com

Oct 24, 2008, 12:26:30 PM
> One of the
> points of my previous post was: would it make sense to abuse CiteProc
> instead? That's where the 'bloat' comes in.

Thanks for clarifying, but I fail to see where there is bloat added,
assuming that you want to use CSL styles + alternative sources of
bibliographic metadata (MODS XML, etc.).

Your previous suggestion:


> [1] Bibliography format (XML/MODS) ->
> |-> [2] Database frontend + data interface (Perl script) ->
> |-- [3] Style engine/formatting/interface (biblatex.sty)

ignores the ability to leverage the 1,000+ existing style files in an
existing format (CSL) that motivated this discussion in the first
place. But, perhaps you don't want to handle CSL.

If you don't mind having a citation styling format that is not used
outside of LaTeX (BST, LaTeX templates, or some new format), then I
agree that CiteProc+CSL don't really make sense & do add bloat.

However, I see value in adopting not only a richer, popular, existing
bibliography format (e.g. MODS XML), but also a styling language (e.g.
CSL).

Philipp Lehman

Oct 24, 2008, 12:53:47 PM
donate...@gmail.com wrote:

>> One of the
>> points of my previous post was: would it make sense to abuse
>> CiteProc instead? That's where the 'bloat' comes in.
>
> Thanks for clarifying, but I fail to see where there is bloat added,
> assuming that you want to use CSL styles + alternative sources of
> bibliographic metadata (MODS XML, etc.).

I don't plan to use CSL styles. I'm not saying it wouldn't be useful
to have some sort of TeX interface to them but I'm not interested in
the job because biblatex operates on the same level.

Note that a biblatex style is a set of TeX macros. It's not an
abstract description of a style but the implementation of a style in
TeX. The style is implemented in the same language as biblatex
itself. You could say that biblatex is just a big toolbox of TeX
programming utilities.

> Your previous suggestion:
>> [1] Bibliography format (XML/MODS) ->
>> |-> [2] Database frontend + data interface (Perl script) ->
>> |-- [3] Style engine/formatting/interface (biblatex.sty)
>
> ignores the ability to leverage the 1,000+ existing style files in
> an existing format (CSL) that motivated this discussion in the first
> place. But, perhaps you don't want to handle CSL.

Indeed. I'm just looking for a tool which does some things you can't
do in TeX (which is not unusual in the LaTeX world; we also have
external index processors). That's all we're talking about in this
thread.

> If you don't mind having a citation styling format that is not used
> outside of LaTeX (BST, LaTeX templates, or some new format), then I
> agree that CiteProc+CSL don't really make sense & do add bloat.

That was my point. Mentioning CiteProc/CSL in this context was
obviously misleading.

> However, I see value in adopting not only a richer, popular,
> existing bibliography format (e.g. MODS XML), but also a styling
> language (e.g. CSL).

I mostly agree with the first part and I don't dispute the second but
adopting styling languages is not the point of biblatex. As I said,
it's essentially a programming toolbox for styles written in TeX. And
yes, we all know that this is a niche application.

donate...@gmail.com

Oct 24, 2008, 4:02:37 PM
On Oct 24, 9:53 am, Philipp Lehman <devnull.1.leh...@spamgourmet.com>
wrote:

> I don't plan to use CSL styles.

Your reasoning is clear & your most recent post answers Simon's
inquiry directly.


> Note that a biblatex style is a set of TeX macros.

Yes, I understand & have used past versions of it. Simon seemed to
have asked if, in the future, we can't use styles that are not only
used by (some) LaTeX users. Your answer is now explicit: BibLaTeX's
audience is only LaTeX users, style files will continue to be written
using BibLaTeX, and there are no immediate plans to move to another
style language. It is out of BibLaTeX's scope. If, down the road,
BibLaTeX or another project finds itself building non-LaTeX tools
to perform citation styling, Simon's query will again be relevant & we
should look outside of the TeX community for tools and formats that we
can reuse.


> > However, I see value in adopting not only a richer, popular,
> > existing bibliography format (e.g. MODS XML), but also a styling
> > language (e.g. CSL).
>
> I mostly agree with the first part

Regarding the first point: if you want a non-TeX tool to handle other
bibliography formats, then you should still look to pre-existing tools
to handle some of this (rather than merely writing a new Perl script, as
in your outline).

Some TeX users already use Chris Putnam's stand-alone 'bibutils' to
obtain BibTeX from disparate bibliographic formats. However, it can
be used as a library to another program and the MODS XML parsing in
'bibutils' is fairly good. Any project to expand the bibliographic
formats usable by BibLaTeX should look to this and similar programs as
a possible starting point.
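
As a rough illustration of the kind of first-stage conversion meant here, the following Python sketch turns one MODS record into a BibTeX entry using only the standard library. It is not how bibutils works internally; the sample record, the `mods_to_bibtex` name, and the choice of `@misc` are assumptions made for the example, and real MODS data (collections, several name roles, encoded dates) needs far more care.

```python
import xml.etree.ElementTree as ET

# MODS v3 namespace; the "m" prefix is only a local alias for the queries below.
MODS_NS = {"m": "http://www.loc.gov/mods/v3"}

# Hypothetical sample record, invented for this sketch.
SAMPLE_MODS = """<mods xmlns="http://www.loc.gov/mods/v3" ID="doe2008">
  <titleInfo><title>An Example Title</title></titleInfo>
  <name type="personal">
    <namePart type="family">Doe</namePart>
    <namePart type="given">Jane</namePart>
  </name>
  <originInfo><dateIssued>2008</dateIssued></originInfo>
</mods>"""


def mods_to_bibtex(mods_xml):
    """Convert a single <mods> element (as a string) into a BibTeX entry."""
    root = ET.fromstring(mods_xml)
    key = root.get("ID", "unknown")
    title = root.findtext("m:titleInfo/m:title", default="", namespaces=MODS_NS)
    year = root.findtext("m:originInfo/m:dateIssued", default="", namespaces=MODS_NS)

    authors = []
    for name in root.findall("m:name", MODS_NS):
        family = name.findtext("m:namePart[@type='family']", default="", namespaces=MODS_NS)
        given = name.findtext("m:namePart[@type='given']", default="", namespaces=MODS_NS)
        # BibTeX convention: "Family, Given"; skip whichever part is missing.
        authors.append(", ".join(part for part in (family, given) if part))

    fields = [("author", " and ".join(authors)), ("title", title), ("year", year)]
    # Emit only the fields that actually have a value.
    body = "".join(f"  {k} = {{{v}}},\n" for k, v in fields if v)
    return f"@misc{{{key},\n{body}}}\n"


print(mods_to_bibtex(SAMPLE_MODS))
```

Even this toy version has to decide how to join name parts and which entry type to use, which is why an existing, tested converter is a better starting point than a fresh script.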

phil...@gmail.com

Oct 26, 2008, 1:38:08 PM
On Oct 24, 9:02 pm, donatetof...@gmail.com wrote:

> Some TeX users already use Chris Putnam's stand-alone 'bibutils' to
> obtain BibTeX from disparate bibliographic formats.  However, it can
> be used as a library to another program and the MODS XML parsing in
> 'bibutils' is fairly good.  Any project to expand the bibliographic
> formats usable by BibLaTeX should look to this and similar programs as
> a possible starting point.

I agree with this. It is all too common for people to look at bib data
and think "it's well defined and simple text, I'll just write a perl
script", which then begins weeks of pain and usually ends in giving up.
Bib data is usually simple, but it interacts in really complex ways for
formatting, so any stable libraries for first-stage parsing are a must.
Bibutils does look good. I'm still wondering what intermediate data
format Philipp is thinking of ... will it still be .bbl? I assume that
it would be a lot easier for biblatex to consume if it were.

Bruce

Nov 9, 2008, 11:26:17 PM
Just for the archives, since I stumbled on it ...

On Oct 24, 5:13 am, Philipp Lehman <devnull.1.leh...@spamgourmet.com>
wrote:


> donatetof...@gmail.com wrote:
> > I wonder if you might misunderstand CSL and CiteProc?  CiteProc is a
> > bit of a "meta" project & there are multiple implementations, each
> > with differing feature sets.  The idea is to be able to take a
> > diverse set of bibliographic metadata (".bib files, XML files,
> > relational DB etc."), CSL style files, and a formatted document and
> > to output a formatted document with citations embedded.
>
> I may very well misunderstand CSL and CiteProc. I don't know much
> about the project, after all.

CSL is just an XML language to describe citation and bibliographic
formatting. Its purpose is to be as independent as possible from any
particular implementation details WRT programming language, or
document or metadata formats.

The big advantage of this approach is that it should be much easier to
build up a large collection of styles (right now there are over 1000)
that can then be used in a variety of workflow contexts. For academics
that publish in a variety of different venues (most of which don't
support TeX), with a variety of different styles, this has the
potential to save a lot of labor and ensure much more flexibility.

The disadvantage vis-a-vis what it sounds like biblatex would do is
that it does not handle more complex features like tying ibid handling
to pages (though I suppose that could be added), or weird stuff like
cross-referencing of footnoted citations (though, again, in theory I
suppose it could be supported in CSL at least; would probably just be
difficult to code in implementations).

CiteProc is just an implementation of a CSL engine. The first one (now
outdated) I wrote using XSLT, but there are others in various states
of development for ruby, python, php, javascript, and haskell.

Bruce

Robin Fairbairns

Nov 10, 2008, 6:38:38 AM
Bruce <bdarcu...@gmail.com> writes:
>Just for the archives, since I stumbled on it ...
>On Oct 24, 5:13 am, Philipp Lehman <devnull.1.leh...@spamgourmet.com>
>wrote:
>> donatetof...@gmail.com wrote:
>> > I wonder if you might misunderstand CSL and CiteProc?  CiteProc is a
>> > bit of a "meta" project & there are multiple implementations, each
>> > with differing feature sets.  The idea is to be able to take a
>> > diverse set of bibliographic metadata (".bib files, XML files,
>> > relational DB etc."), CSL style files, and a formatted document and
>> > to output a formatted document with citations embedded.
>>
>> I may very well misunderstand CSL and CiteProc. I don't know much
>> about the project, after all.
>
>CSL is just an XML language to describe citation and bibliographic
>formatting. Its purpose is to be as independent as possible from any
>particular implementation details WRT programming language, or
>document or metadata formats.
>
>The big advantage of this approach is that it should be much easier to
>build up a large collection of styles (right now there are over 1000)
>that can then be used in a variety of workflow contexts. For academics
>that publish in a variety of different venues (most of which don't
>support TeX), with a variety of different styles, this has the
>potential to save a lot of labor and ensure much more flexibility.
>
>The disadvantage vis-a-vis what it sounds like biblatex would do is
>that it does not handle more complex features like tying ibid handling
>to pages (though I suppose that could be added), or weird stuff like
>cross-referencing of footnoted citations (though, again, in theory I
>suppose it could be supported in CSL at least; would probably just be
>difficult to code in implementations).

how would it do such a thing, independent of the typesetting engine?
previous discussions seemed to hover on the edge of claiming that csl
engines would provide typeset output; it's unclear to me how this
would be achieved, even considering such black-and-white distinctions
as word processor vs. markup language typesetting.

>CiteProc is just an implementation of a CSL engine. The first one (now
>outdated) I wrote using XSLT, but there are others in various states
>of development for ruby, python, php, javascript, and haskell.

yah, but ... for those of us whose comprehension doesn't stretch to
decoding an xml schema, but who want more than translation of the
acronym, is there anything on the net to help?

what you say about csl seems close to motherhood and apple pie, but
there are the odd things that worry me about it, and i see no way of
getting a clearer understanding.
--
Robin Fairbairns, Cambridge

phil...@gmail.com

Nov 10, 2008, 11:29:05 AM
On Nov 10, 12:38 pm, r...@cl.cam.ac.uk (Robin Fairbairns) wrote:

> what you say about csl seems close to motherhood and apple pie, but
> there are the odd things that worry me about it, and i see no way of
> getting a clearer understanding.

The problem with this thread is that it has two streams going at once;
one is talking about enhancing a very specific interface between
bibliographic data and typesetting and the other is talking about
generic bibliographic systems with a smattering of formatting on top.

We have to keep clear here that biblatex reads some bibliographic
data, messes about with it and then typesets in intricate detail using
TeX/LaTeX. There is no way to "replace biblatex" without replacing the
typesetting system too. You can replace the bibliographic data level up
to the point where the typesetting starts and that's really all the
thread was about originally. I can't personally see any point in
discussing more than this since that would be a separate thread about
"why not use a completely different document preparation system
altogether" which might be interesting to talk about but it has
nothing to do with the original discussion about the BibTeX imposed
limitations of biblatex.

Specifically, there is no point worrying about whether CSL could do
ibidem stuff or footnote cite tracking - biblatex already does all
that and even if CSL did it, it wouldn't be integrated into the
typesetting system, which it must be if you're using TeX. The biblatex
author is only interested in replacing the data model and interface to
the data model. Anything more and the discussion is a more general one
about what system to use to do documents and bibliographies which
isn't a topic biblatex users are thinking about at all, I suspect.

Robin Fairbairns

Nov 10, 2008, 11:48:17 AM
phil...@gmail.com writes:

>On Nov 10, 12:38 pm, r...@cl.cam.ac.uk (Robin Fairbairns) wrote:
>
>> what you say about csl seems close to motherhood and apple pie, but
>> there are the odd things that worry me about it, and i see no way of
>> getting a clearer understanding.
>
>The problem with this thread is that it has two streams going at once;
>one is talking about enhancing a very specific interface between
>bibliographic data and typesetting and the other is talking about
>generic bibliographic systems with a smattering of formatting on top.

i know that (i too have been reading the thread).

the specific question i asked was whether there was some description
of csl that i had missed (on the basis that i don't read xml schemas
as a means to learning about things' semantics ... i can get an xml
schema without problem, but it doesn't necessarily help).

i had vaguely hoped that a spec of the semantics (rather than merely
the syntax) of csl might be out there somewhere.

since you ducked my question, i'm guessing there isn't. perhaps one
has to be initiated into a magic circle (or something else arcane) to
get to learn about the format.

(bibtex is a bit like this. the language seems inaccessible to rather
a large proportion of the potential user community; this is a bore,
since it places a burden on those who do understand the language.
there is a program auto-generator, but it doesn't provide every
possible format one might like.)
--
Robin Fairbairns, Cambridge

Bruce

Nov 10, 2008, 12:51:30 PM
On Nov 10, 6:38 am, r...@cl.cam.ac.uk (Robin Fairbairns) wrote:

I presume something like how Zotero integrates with OpenOffice or Word
*might* provide some hints? There, you have two basic pieces of code:

1) a script within the word-processor that collects and updates the
citations and bibliographic entries

2) a CSL processor, which reads a CSL style into some internal model,
and then takes the relevant metadata and processes the finished
citations and bibliographic entries for insertion in #1
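
To make the split between the two pieces concrete, here is a minimal Python sketch. It is only an analogy: the `[@key]` marker syntax, the function names, and the dictionary "style" are all invented for illustration, and a real CSL style is an XML document with far richer rules than two format strings.

```python
import re

# Hypothetical in-text citation marker, e.g. "[@doe2008]".
CITE_PATTERN = re.compile(r"\[@([A-Za-z0-9_:-]+)\]")


def collect_citations(text):
    """Piece 1: find cited keys, in order of first appearance."""
    seen = []
    for key in CITE_PATTERN.findall(text):
        if key not in seen:
            seen.append(key)
    return seen


def render(style, metadata, keys):
    """Piece 2: turn metadata into finished citation and bibliography strings."""
    citations = {k: style["citation"].format(**metadata[k]) for k in keys}
    # Bibliography sorted by author, one common convention among many.
    ordered = sorted(keys, key=lambda k: metadata[k]["author"])
    bibliography = [style["entry"].format(**metadata[k]) for k in ordered]
    return citations, bibliography


def insert_citations(text, citations):
    """Piece 1 again: put the finished strings back into the document."""
    return CITE_PATTERN.sub(lambda m: citations[m.group(1)], text)


# Toy stand-in for a style file: two format strings.
STYLE = {"citation": "({author}, {year})", "entry": "{author} ({year}). {title}."}
METADATA = {
    "doe2008": {"author": "Doe", "year": "2008", "title": "An Example"},
    "roe2007": {"author": "Roe", "year": "2007", "title": "Another Example"},
}

doc = "See [@roe2007] and [@doe2008]; again [@roe2007]."
keys = collect_citations(doc)
cites, bib = render(STYLE, METADATA, keys)
print(insert_citations(doc, cites))
print("\n".join(bib))
```

Note that the processor never sees page or footnote positions; it gets keys and metadata and hands back strings, which is the separation under discussion.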

>> CiteProc is just an implementation of a CSL engine. The first one (now
>> outdated) I wrote using XSLT, but there are others in various states
>> of development for ruby, python, php, javascript, and haskell.
>
> yah, but ... for those of us whose comprehension doesn't stretch to
> decoding an xml schema, but who want more than translation of the
> acronym, is there anything on the net to help?

There needs to be a spec, but there is not ATM. In the meantime, there
is some documentation hosted at the Zotero project that might be
helpful:

<http://dev.zotero.org/csl_syntax_summary>

Bruce

Simon Spiegel

Nov 11, 2008, 12:14:05 PM
On 2008-11-10 17:48:17 +0100, rf...@cl.cam.ac.uk (Robin Fairbairns) said:

> phil...@gmail.com writes:
>> On Nov 10, 12:38 pm, r...@cl.cam.ac.uk (Robin Fairbairns) wrote:
>>
>>> what you say about csl seems close to motherhood and apple pie, but
>>> there are the odd things that worry me about it, and i see no way of
>>> getting a clearer understanding.
>>
>> The problem with this thread is that it has two streams going at once;
>> one is talking about enhancing a very specific interface between
>> bibliographic data and typesetting and the other is talking about
>> generic bibliographic systems with a smattering of formatting on top.
>
> i know that (i too have been reading the thread).
>
> the specific question i asked was whether there was some description
> of csl that i had missed (on the basis that i don't read xml schemas
> as a means to learning about things' semantics ... i can get an xml
> schema without problem, but it doesn't necessarily help).
>
> i had vaguely hoped that a spec of the semantics (rather than merely
> the syntax) of csl might be out there somewhere.
>
> since you ducked my question, i'm guessing there isn't. perhaps one
> has to be initiated into a magic circle (or something else arcane) to
> get to learn about the format.

Comparing the number of CSL styles that have been made public since the
release of Zotero and the number of .bst styles published during the same
time, I'd estimate that it's not so difficult to enter the magic CSL
circle, at least much easier than learning .bst voodoo.


simon

Joseph Wright

Nov 11, 2008, 1:21:03 PM
On Nov 11, 5:14 pm, Simon Spiegel <si...@remove.simifilm.ch> wrote:
>
> Comparing the number of CSL styles that have been made public since the
> release of Zotero and the number of .bst styles published during the same
> time, I'd  estimate that it's not so difficult to enter the magic CSL
> circle, at least much easier than learning .bst voodoo.

From a biblatex point of view, the number of CSL styles is not really
important. After all, biblatex does the formatting (using TeX): it
just needs the data in a suitable format.
--
Joseph Wright

Robin Fairbairns

Nov 12, 2008, 11:19:47 AM

there are an awful lot of bib styles on ctan, let alone any that
people keep hidden away. since i don't even know how long csl has
been around, i find it impossible to make the comparison. (people
tend to ask before making new bib styles, after all, so the rate of
generation of new ones is bound to tail off ... and the process has
been going since the 1980s.)

hand-waving arguments, therefore, don't get us anywhere. i'm
definitely not interested in working on csl styles -- what i want to
know is whether i should be pointing people at csl, in places like the
tex faq. so far, the evidence seems to be that even a passing mention
would be out of place.
--
Robin Fairbairns, Cambridge

Dan Luecking

Nov 12, 2008, 12:26:12 PM
On 12 Nov 2008 16:19:47 GMT, rf...@cl.cam.ac.uk (Robin Fairbairns)
wrote:

> Simon Spiegel <si...@remove.simifilm.ch> writes:
>>On 2008-11-10 17:48:17 +0100, rf...@cl.cam.ac.uk (Robin Fairbairns) said:
>>>
>>> i had vaguely hoped that a spec of the semantics (rather than merely
>>> the syntax) of csl might be out there somewhere.
>>>
>>> since you ducked my question, i'm guessing there isn't. perhaps one
>>> has to be initiated into a magic circle (or something else arcane) to
>>> get to learn about the format.
>>
>>Comparing the number of CSL styles that have been made public since the
>>release of Zotero and the number of .bst styles published during the same
>>time, I'd estimate that it's not so difficult to enter the magic CSL
>>circle, at least much easier than learning .bst voodoo.
>
>there are an awful lot of bib styles on ctan, let alone any that
>people keep hidden away.

It's hard to count the number of times someone has asked
here on c.t.t, "How do I get X.bst, but with such-and-such",
and been told to modify X.bst (renamed) with some code. There
must be hundreds. And of course makebst probably creates many
more than that. And generally, none of these get on any archive.

>hand-waving arguments, therefore, don't get us anywhere. i'm
>definitely not interested in working on csl styles -- what i want to
>know is whether i should be pointing people at csl, in places like the
>tex faq. so far, the evidence seems to be that even a passing mention
>would be out of place.

Not unless there is some easily accessible (preferably free)
way to produce output suitable for input to TeX or LaTeX.
Preferably from .bib files (via a conversion step if
necessary). I haven't read anyone alleging any of that, yet.


Dan
To reply by email, change LookInSig to luecking

phil...@gmail.com

Nov 12, 2008, 3:36:10 PM
On Nov 10, 5:48 pm, r...@cl.cam.ac.uk (Robin Fairbairns) wrote:

> i know that (i too have been reading the thread).

Apologies - I realised I threaded the reply in response to your post
and not "Bruce"'s as intended ...

phil...@gmail.com

Nov 12, 2008, 3:51:04 PM
On Nov 10, 6:51 pm, Bruce <bdarcus.li...@gmail.com> wrote:

> 1) a script within the word-processor that collects and updates the
> citations and bibliographic entries
>
> 2) a CSL processor, which reads a CSL style into some internal model,
> and then takes the relevant metadata and processes the finished
> citations and bibliographic entries for insertion in #1


I've written a few XML/XSLT systems (and the more painful SGML before
XML existed) and I had a read around the CSL/Citeproc docs. I can't
see that CSL is useful at the moment as a full bibliographic solution
for TeX/LaTeX. It's basically an XML schema with an XSLT 2.0 processor
on top. This means it has nothing to do with TeX semantics - it would
be a purely syntactic conversion as far as TeX is concerned, which
would make sophisticated things impossible. It would have to dump the
result into some sort of file, the closest format being .bbl but then
you still need the .sty layer to do the syntax->typesetting semantics
layer. The XSLT transformation layer of CSL-based Citeproc is XSLT 2
which means it has the "unparsed-text()" function but that's not
really much use here. The unparsed-text() function really just allows
XSLT to read in non-XML (the nasty neologism used is "transclusion")
and so, yes, you could suck in TeX or something but what would you do
with it? Parse it outside of TeX? I think this shows the problem
really - yes you could technically do something with CSL and TeX and
"replace" BibTeX/biblatex but that would pretty much amount to
re-implementing enormous chunks of them in XSLT or whatever and there is
no point to that and let's face it, nobody is going to do it.

You *could* use CSL to just get you a .bbl file and then use an
existing bib package (natbib etc.) to format the results. This would
just cut out bibtex as a data processor which was the point of the
original debate - this would work but anything more than that and you
start to get into the typesetting semantics which CSL doesn't know
anything about at all (and shouldn't). It's a bit disturbing that
Citeproc quotes disparagingly most other solutions like BibTeX:

"In all cases, formatting is tied directly to the application."

To do anything sophisticated, especially with citations and citation/
bib interaction, it *must* be so tied. Like Philipp Lehman said - how
could you do per-page/para/chapter etc. ibidem tracking otherwise? How
would you specify that only things cited twice or more go in the bib?
The beauty of a tight coupling between the typesetting engine and the
bib processor is that much more can be automated which means much
easier and more accurate implementation of specific bib styles.
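
A small Python sketch may show why this information has to come from the typesetter. It assumes one possible ibidem rule (same source as the immediately preceding citation, on the same page) and a "cited at least twice" filter; the function names and the rule itself are illustrative assumptions, not biblatex's implementation.

```python
from collections import Counter


def mark_ibid(citations):
    """Flag citations that may be printed as 'ibid.'.

    `citations` is a list of (key, page) pairs in document order. The page
    is where the citation ended up after page breaking, so it is only
    known once the document has been typeset.
    """
    flags = []
    previous = None
    for key, page in citations:
        # Ibid only if the previous citation was the same source on the same page.
        flags.append(previous == (key, page))
        previous = (key, page)
    return flags


def bibliography_keys(citations, min_cites=2):
    """Keep only sources cited at least `min_cites` times."""
    counts = Counter(key for key, _ in citations)
    return sorted(key for key, n in counts.items() if n >= min_cites)


cites = [("doe", 1), ("doe", 1), ("doe", 2), ("roe", 2), ("roe", 2)]
print(mark_ibid(cites))
print(bibliography_keys(cites))
```

The third citation of "doe" loses its ibid purely because a page break fell before it, something an external processor cannot know in advance.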

In the particular TeX case in hand, the choice is between having a
nice language for specifying styles (XML - nice, TeX - not so nice)
and having one which sits at the right semantic level for really
complex and sophisticated interaction with the typesetting (XML -
useless, completely the wrong level, TeX - perfect). Can't have both -
general systems can't be the best (or sometimes even adequate) at
specific tasks.

So:

1. As a BibTeX data model replacement, CSL/Citeproc is ok.

2. As a data translator into an amenable TeX format like .bbl, also
probably ok but you'd need a TeX/LaTeX style and a few TeX/LaTeX runs
to do something with the output file, as usual.

3. As a further step, generating TeX/LaTeX macros to typeset the
citations/references, yes you could do a few things but they'd turn
out to be pretty basic and messy because you would have no access to
any TeX compile-time data. That was the whole motivation for biblatex
in the first place.
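
For point 2, a data translator of this kind could be sketched in Python roughly as follows. It writes the classic `thebibliography` layout that a plain LaTeX run can read; the `to_bbl` name and the record fields are assumptions for the example, no TeX special characters are escaped, and the .bbl that biblatex itself expects uses a different format.

```python
def to_bbl(entries):
    """Render records as a classic LaTeX thebibliography block."""
    # The environment argument is a sample "widest label", e.g. 9 or 99.
    widest = "9" * len(str(len(entries)))
    lines = [r"\begin{thebibliography}{" + widest + "}", ""]
    for entry in entries:
        lines.append(r"\bibitem{" + entry["key"] + "}")
        lines.append(entry["author"] + ".")
        lines.append(r"\newblock " + entry["title"] + ".")
        lines.append(r"\newblock " + entry["year"] + ".")
        lines.append("")
    lines.append(r"\end{thebibliography}")
    return "\n".join(lines) + "\n"


# Hypothetical record, invented for this sketch.
records = [{"key": "doe2008", "author": "Jane Doe",
            "title": "An Example Title", "year": "2008"}]
print(to_bbl(records))
```

All formatting decisions are already frozen into the file at this point, so nothing in it can react to where a citation lands on the page.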

PK

Bruce

Nov 12, 2008, 6:01:26 PM
I really didn't want to get drawn into a long discussion about csl and
citeproc, and was more intending to just fill in some blanks, but ...

On Nov 12, 3:51 pm, philk...@gmail.com wrote:
> On Nov 10, 6:51 pm, Bruce <bdarcus.li...@gmail.com> wrote:
>
> > 1) a script within the word-processor that collects and updates the
> > citations and bibliographic entries
>
> > 2) a CSL processor, which reads a CSL style into some internal model,
> > and then takes the relevant metadata and processes the finished
> > citations and bibliographic entries for insertion in #1
>
> I've written a few XML/XSLT systems (and the more painful SGML before
> XML existed) and I had a read around the CSL/Citeproc docs. I can't
> see that CSL is useful at the moment as a full bibliographic solution
> for TeX/LaTeX. It's basically an XML schema with an XSLT 2.0 processor
> on top.

The XSLT processing stuff is effectively a proof-of-concept. The key
idea is to abstract citation processing instructions away from low-
level formatting details so that one could use the same style in
different contexts.

Zotero wrote their own Javascript-based implementation and I have
often wondered if it would be possible/sensible to write a version in
Lua that could hook up with LuaTeX.

> This means it has nothing to do with TeX semantics

Yes, by design; I consider that a feature.

[...]

> You *could* use CSL to just get you a .bbl file and then use an
> existing bib package (natbib etc.) to format the results. This would
> just cut out bibtex as a data processor which was the point of the
> original debate - this would work but anything more than that and you
> start to get into the typesetting semantics which CSL doesn't know
> anything about at all (and shouldn't). It's a bit disturbing that
> Citeproc quotes disparagingly most other solutions like BibTeX:
>
> "In all cases, formatting is tied directly to the application."

Why is that disturbing? It's true, and it's a big problem for authors
like me who live in a world of the web, and publishers who don't
accept TeX files.

> To do anything sophisticated, especially with citations and citation/
> bib interaction, it *must* be so tied. Like Philipp Lehman said - how
> could you do per-page/para/chapter etc. ibidem tracking otherwise?

I have never come across such a style, and it hasn't yet come up.

> How would you specify that only things cited twice or more go in the bib?

Add it as an option.

> The beauty of a tight coupling between the typesetting engine and the
> bib processor is that much more can be automated which means much
> easier and more accurate implementation of specific bib styles.

Right, but at the expense of how much time to write a new style? And
usable only with LaTeX. For most of the world, that's a deal-breaker.

> In the particular TeX case in hand, the choice is between having a
> nice language for specifying styles (XML - nice, TeX - not so nice)
> and having one which sits at the right semantic level for really
> complex and sophisticated interaction with the typesetting (XML -
> useless, completely the wrong level, TeX - perfect). Can't have both -
> general systems can't be the best (or sometimes even adequate) at
> specific tasks.

That may be.

Bruce

Simon Spiegel

Nov 12, 2008, 6:23:35 PM
On 2008-11-12 17:19:47 +0100, rf...@cl.cam.ac.uk (Robin Fairbairns) said:

> Simon Spiegel <si...@remove.simifilm.ch> writes:
>> On 2008-11-10 17:48:17 +0100, rf...@cl.cam.ac.uk (Robin Fairbairns) said:
>>> the specific question i asked was whether there was some description
>>> of csl that i had missed (on the basis that i don't read xml schemas
>>> as a means to learning about things' semantics ... i can get an xml
>>> schema without problem, but it doesn't necessarily help).
>>>
>>> i had vaguely hoped that a spec of the semantics (rather than merely
>>> the syntax) of csl might be out there somewhere.
>>>
>>> since you ducked my question, i'm guessing there isn't. perhaps one
>>> has to be initiated into a magic circle (or something else arcane) to
>>> get to learn about the format.
>>
>> Comparing the number of CSL styles that have been made public since the
>> release of Zotero and the number of .bst styles published during the same
>> time, I'd estimate that it's not so difficult to enter the magic CSL
>> circle, at least much easier than learning .bst voodoo.
>
> there are an awful lot of bib styles on ctan, let alone any that
> people keep hidden away. since i don't even know how long csl has
> been around, i find it impossible to make the comparison.

Zotero, which ATM definitely is the most important implementation of a
CSL parser, was released as a public beta in October 2006. AFAICS
they're mentioning the creation of CSL styles by users for the first
time in a blog entry on September 10, 2007
(http://www.zotero.org/blog/zotero-gets-your-style/ ). Maybe Bruce
posted information on this subject earlier on his blog. Anyway, you get
an idea how long it has been here.


> hand-waving arguments, therefore, don't get us anywhere. i'm
> definitely not interested in working on csl styles -- what i want to
> know is whether i should be pointing people at csl, in places like the
> tex faq. so far, the evidence seems to be that even a passing mention
> would be out of place.

Maybe I'm getting something wrong here, but it's my impression that you
have an unnecessarily hostile tone in this thread. Honestly, I have no
idea why.

simon


Simon Spiegel

Nov 12, 2008, 6:28:55 PM

I think we now agree that *for LaTeX* biblatex is the better solution
to typeset the citations/references. But, the point of CSL is, as Bruce
says, that you get a style which is usable across different systems.
I'd say that for many users, this is a big plus, and there probably are
a lot of scenarios where the benefit of a universally usable style
could outweigh the loss in typographic control.

simon

Robin Fairbairns

Nov 13, 2008, 6:37:38 AM

so we're agreed (i paraphrase) that latex users should forget about
csl, on the grounds that we already have a better solution available?

i had imagined that there might be a way forward, but since i've not
found any documentation of anything i'm inclined to forget anyone ever
told me about csl.
--
Robin Fairbairns, Cambridge

Simon Spiegel

Nov 13, 2008, 7:23:25 AM
On 2008-11-13 12:37:38 +0100, rf...@cl.cam.ac.uk (Robin Fairbairns) said:

> Simon Spiegel <si...@remove.simifilm.ch> writes:
>> On 2008-11-12 21:51:04 +0100, phil...@gmail.com said:
>>> So:
>>>
>>> 1. As a BibTeX data model replacement, CSL/Citeproc is ok.
>>>
>>> 2. As a data translator into an amenable TeX format like .bbl, also
>>> probably ok but you'd need a TeX/LaTeX style and a few TeX/LaTeX runs
>>> to do something with the output file, as usual.
>>>
>>> 3. As a further step, generating TeX/LaTeX macros to typeset the
>>> citations/references, yes you could do a few things but they'd turn
>>> out to be pretty basic and messy because you would have no access to
>>> any TeX compile-time data. That was the whole motivation for biblatex
>>> in the first place.
>>
>> I think we now agree that *for LaTeX* biblatex is the better solution
>> to typeset the citations/references. But, the point of CSL is, as Bruce
>> says, that you get a style which is usable across different systems.
>> I'd say that for many users, this is a big plus, and there probably are
>> a lot of scenarios where the benefit of a universally usable style
>> could outweigh the loss in typographic control.
>
> so we're agreed (i paraphrase) that latex users should forget about
> csl, on the grounds that we already have a better solution available?

That's not really what I said, no. I guess it really depends on the
specific case of use.

ATM this is theoretical anyway, as there is no implementation of CSL
for LaTeX. But if there was, I guess it would really depend on the
given scenario.

For example, if someone has a big bibtex database, his own carefully
adapted biblatex style and no need to exchange his data with someone
outside the TeX world, there certainly is no reason to use CSL (which
was actually exactly the situation I was in when I wrote my book).

But if someone already has a big database established with Zotero or
another CSL compatible setup and there's a CSL style available for the
specific journal or whatever he needs, why shouldn't he use CSL (if
there was a CSL implementation for LaTeX, of course)? I mean there
certainly would be no point in recreating a style in biblatex if a
usable CSL style was already available.

> i had imagined that there might be a way forward, but since i've not
> found any documentation of anything i'm inclined to forget anyone ever
> told me about csl.

As I see it, CSL has just started. Zotero, which is still quite young,
has been a huge success and sees constant improvement. And if I
understood correctly, the next version of the OpenDocument format will
offer support for CSL. Add the growing discontent with Endnote (and its
limitations) to that, and I see the possibility of quite a bright
future for CSL. Of course, this is all just beginning, but – I'm
repeating myself – CSL is still young.

As for documentation: What exactly, are you looking for? There's the
Zotero site Bruce mentioned, there's a Zotero developer user group,
there's the xbiblio mailing list. And there are many styles to try and
to analyze.

Simon

Bruce

Nov 13, 2008, 9:14:53 AM
On Nov 13, 6:37 am, r...@cl.cam.ac.uk (Robin Fairbairns) wrote:

...

> so we're agreed (i paraphrase) that latex users should forget about
> csl, on the grounds that we already have a better solution available?

Define "better" perhaps, and then there might be room for agreement.

> i had imagined that there might be a way forward, but since i've not
> found any documentation of anything i'm inclined to forget anyone ever
> told me about csl.

Robin, I've been patient with you, avoiding being drawn in by your
provocative rhetoric. But here you're just ignoring facts entirely. In
direct reply to your request for documentation, I pointed you to
some.

I'll ask the same question that Simon asked of you, what's with the
hostility here?

Bruce

Robin Fairbairns

Nov 13, 2008, 10:46:34 AM
Bruce <bdarcu...@gmail.com> writes:

>On Nov 13, 6:37 am, r...@cl.cam.ac.uk (Robin Fairbairns) wrote:
>...
>
>> so we're agreed (i paraphrase) that latex users should forget about
>> csl, on the grounds that we already have a better solution available?
>
>Define "better" perhaps, and then there might be room for agreement.
>
>> i had imagined that there might be a way forward, but since i've not
>> found any documentation of anything i'm inclined to forget anyone ever
>> told me about csl.
>
>Robin, I've been patient with you, avoiding being drawn in by your
>provocative rhetoric. But here you're just ignoring facts entirely. In
>direct reply to your request for documentation, I pointed you to
>some.

sigh. i had missed the part of your post that pointed me at zotero (i
had scanned it and then saved it for [fully] reading later ... and
"later" hasn't arrived yet). as you note, it's not what i asked for,
but it's something.

>I'll ask the same question that Simon asked of you, what's with the
>hostility here?

as i hinted above, there seems to be general agreement (that i can't
understand) that the csl model can't work with biblatex -- i.e., can't
separate the citations from the references and either from the
typesetting, and that therefore we (tex users) are "doomed" for ever
to stay with biblatex. (which i believe to be a splendid piece of
work.)

repeatedly telling us that csl (or similar) is the way of the future
(and hence ignoring our typographic concerns), has got me wound up.

i've probably overdone it. i shall withdraw from this thread
forthwith.
--
Robin Fairbairns, Cambridge

Dan Luecking

Nov 13, 2008, 6:13:57 PM
On Thu, 13 Nov 2008 06:14:53 -0800 (PST), Bruce
<bdarcu...@gmail.com> wrote:

>On Nov 13, 6:37 am, r...@cl.cam.ac.uk (Robin Fairbairns) wrote:
>
>...
>
>> so we're agreed (i paraphrase) that latex users should forget about
>> csl, on the grounds that we already have a better solution available?
>
>Define "better" perhaps, and then there might be room for agreement.
>
>> i had imagined that there might be a way forward, but since i've not
>> found any documentation of anything i'm inclined to forget anyone ever
>> told me about csl.
>
>Robin, I've been patient with you, avoiding being drawn in by your
>provocative rhetoric. But here you're just ignoring facts entirely. In
>direct reply to your request for documentation, I pointed you to
>some.

Robin hasn't used any provocative rhetoric that I can see.
He has shown what seems like frustration (perhaps the same
as I have felt) in trying to discover, from the answers here
and the pointers given, what CSL actually *is* and what it
can do.

The documentation you've pointed at seems designed for
someone who already knows something. I can't even tell
what it is I don't know that I need to know. I was led
(as a start) to find out what "XML Schema" means (also
XSLT 2 and "namespace").

I actually found a pretty good starter document on Schema,
but so far don't know anything about
- how to obtain a working CSL,
- what to feed it,
- what it is capable of doing for me (w.r.t. TeX),
- how to train it to do what it is capable of.

I am fairly patient and will wait until I've read some
more before giving up, but I can certainly see Robin's
problem.

>
>I'll ask the same question that Simon asked of you, what's with the
>hostility here?

I have detected not one bit of hostility. Why the
accusatory tone?

donate...@gmail.com

Nov 13, 2008, 9:23:34 PM
On Nov 12, 8:19 am, r...@cl.cam.ac.uk (Robin Fairbairns) wrote:

> what i want to
> know is whether i should be pointing people at csl, in places like the
> tex faq.  so far, the evidence seems to be that even a passing mention
> would be out of place.

Right now, a passing mention would only be able to state:
(1) pandoc and citeproc-hs can produce a LaTeX-formatted bibliography
(although this isn't even as nice as a .bbl file)
(2) CSL might be one (tedious) method to produce a BibTeX file from
another data source (though there are probably better ways)

A mention now probably isn't worthwhile. If some implementation of
citeproc was made to export .bbl, I think it would warrant a passing
mention (due mostly to the volume of styles that already exist after a
short time).

If someone actually improved the interaction of CSL/citeproc with
LaTeX, it would definitely be worth looking at & may warrant something
more than a passing mention.

In short: it would be short-sighted to think that CSL is a non-starter
that has no future in a LaTeX workflow, but it is also probably
premature to add it to the FAQ.

--
MK

donate...@gmail.com

Nov 13, 2008, 9:45:03 PM
We can approach this from a developer perspective or from an end user
perspective. End users who are writing only in LaTeX probably don't
need to know anything about CSL at this time. You seem to have
approached the topic from this stance and, to a certain extent, so has
Robin. Developers who are looking to build new tools _might_ be
interested in CSL: there are already numerous citation styles written
in it, it is relatively easy to write new ones, and the format is used
outside of the LaTeX community.

> I was led (as a start) to find out what "XML Schema" means (also
> XSLT 2 and "namespace").

Sounds similar to my first exposure to BST files (and some of this
thread does sound like other conversations about BST & Word/OO.o).

Also note that "XSLT" is a bit of a red herring, as it is used in only
one proof-of-concept implementation of CiteProc, the tool that uses
CSL to format citations. XSLT is not used (as Phil may have
incorrectly implied) in more recent (and more complete)
implementations of CiteProc.


>   - how to obtain a working CSL,

The same way you obtain a working BST: download it from someone else
(such as http://www.zotero.org/styles/) or build one yourself (using
either a style generator or by hand-coding XML).


>   - what to feed it

CSL is just a set of instructions for how to format citations. You'd
feed a CSL file, your document, and a citation database to an
implementation of CiteProc (probably Zotero or citeproc-hs at this
time) & would generate another document.


>   - what it is capable of doing for me (w.r.t. TeX),

As above, not much right now: you can generate LaTeX-formatted markup
or can use it as a round-about way to end up with a BibTeX file. In
the future, maybe more (but this would require better tools, as
highlighted in this thread).

--
MK

Robin Fairbairns

Nov 14, 2008, 6:24:46 AM
donate...@gmail.com writes:
>[...]

>In short: it would be short-sighted to think that CSL is a non-starter
>that has no future in a LaTeX workflow, but it is also probably
>premature to add it to the FAQ.

thanks.

if i were nearer to retirement, i might think of adding a
latex-citeproc to my list of projects. while i'm still working, i've
too many other things to do (including keeping ctan working and
getting the catalogue into shape ... not to mention the faq).
--
Robin Fairbairns, Cambridge

Dan Luecking

Nov 14, 2008, 12:27:01 PM
On Thu, 13 Nov 2008 18:45:03 -0800 (PST), donate...@gmail.com
wrote:

I apologize for the number of questions below. You don't need to
answer any of them, of course. Pointers to basic (preferably
self-contained) documentation would be especially welcome.

>We can approach this from a developer perspective or from an end user
>perspective. End users who are writing only in LaTeX probably don't
>need to know anything about CSL at this time. You seem to have
>approached the topic from this stance and, to a certain extent, so has
>Robin. Developers who are looking to build new tools _might_ be
>interested in CSL: there are already numerous citation styles written
>in it, it is relatively easy to write new ones, and the format is used
>outside of the LaTeX community.

Developers need to know how things will affect users. They also
need to use a system themselves, if only to test a development.
If I seem to be mainly user oriented, it's because I know so little
and need that info as a first step to understanding. If I only
wanted to write citation styles, I'd still need to know how a
user would use them. Your last sentence suggests that LaTeX users
might be able to share in a wider community, if only someone who
knew both TeX and CSL could develop an appropriate tool.

>
>> I was led (as a start) to find out what "XML Schema" means (also
>> XSLT 2 and "namespace").
>
>Sounds similar to my first exposure to BST files (and some of this
>thread does sound like other conversations about BST & Word/OO.o).

But I can read the documentation of the .bst language (and point
anyone to that documentation, a single file). As near as I can
tell, between it and the documentation of the .bib format (in
the latex manual) it is self-contained.

>
>Also note that "XSLT" is a bit of a red herring, as it is used in only
>one proof-of-concept implementation of CiteProc, the tool that uses
>CSL to format citations. XSLT is not used (as Phil may have
>incorrectly implied) in more recent (and more complete)
>implementations of CiteProc.
>
>
>>   - how to obtain a working CSL,
>
>The same way you obtain a working BST: download it from someone else
>(such as http://www.zotero.org/styles/) or build one yourself (using
>either a style generator or by hand-coding XML).

I thought CSL was a system: people have been contrasting
it to bibtex. From the above, it seems to me that downloading
a CSL is only part of the system, and therefore could hardly
be said to be "working". Just as downloading a BST doesn't
give one anything without BibTeX.

With bibtex I can
- obtain a working bibtex by installing TeX Live or MiKTeX.
- I feed it .bst styles and .bib databases.
- It is capable of sorting, parsing names, etc., and writing
arbitrary text (usually some sort of bibliography environment
complete with \bibitem commands for each entry).
- I train it by writing a .bst or finding one that does what I
want.

You seem to be saying that CSL is the equivalent of the
second half of the last step.

Is citeproc the equivalent of bibtex in the above? If so,
it hasn't been made clear (to me) in this thread.

I can add a 5th point:
- I *use* bibtex by running "bibtex filename" where
filename.aux contains the name of the database,
the .bst and the citation keys. (Usually, but not
necessarily, this file is written by LaTeX macros.)
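
To make the fifth point concrete, here is a minimal Python sketch of the
three things bibtex picks out of a .aux file: the \bibdata, \bibstyle and
\citation lines. The function name read_aux and the sample file contents
are made up for illustration; this is not how bibtex itself is
implemented, only a picture of its input.

```python
import re

def read_aux(aux_text):
    """Collect the database names, style name and citation keys from .aux text."""
    info = {"bibdata": [], "bibstyle": None, "citations": []}
    for line in aux_text.splitlines():
        m = re.match(r"\\(bibdata|bibstyle|citation)\{([^}]*)\}", line.strip())
        if not m:
            continue  # every other .aux line is irrelevant here
        command, argument = m.groups()
        if command == "bibdata":
            info["bibdata"].extend(name.strip() for name in argument.split(","))
        elif command == "bibstyle":
            info["bibstyle"] = argument
        else:
            # A \citation line may list several keys; keep first-seen order.
            for key in argument.split(","):
                key = key.strip()
                if key not in info["citations"]:
                    info["citations"].append(key)
    return info

# Hypothetical .aux contents, as LaTeX might write them.
sample_aux = r"""\relax
\citation{knuth84}
\bibstyle{plain}
\citation{lamport94}
\citation{knuth84}
\bibdata{refs}
"""

print(read_aux(sample_aux))
```

Nothing in that input comes from the document body, which is why the
file could just as well be written by hand.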

How do I *use* citeproc (or zotero or whatever)?

>
>>   - what to feed it
>
>CSL is just a set of instructions for how to format citations. You'd
>feed a CSL file, your document, and a citation database to an
>implementation of CiteProc (probably Zotero or citeproc-hs at this
>time) & would generate another document.

What does CiteProc do with the document itself? Is a citeproc
implementation tied to a particular document preparation system?

Bibtex, for example, uses only information supplied by the user.
I could feed it that information without any document and it would
happily write me a .bbl file. With the right .bst it could
probably write it with troff markup instead of TeX.

What is the format of the citation database? With bibtex it is
the .bib file, which is independent of the bst. I assume the
database for citeproc consists of xml markup, is it independent
of the csl? Independent of the citeproc implementation?

Also .bib files are "open-ended". That is, a bibliographic
record can contain arbitrary fields. A BST determines which
ones are used. Is that true of the citeproc/csl/database
system?

>>   - what it is capable of doing for me (w.r.t. TeX),
>
>As above, not much right now: you can generate LaTeX-formatted markup

Hell, that's exactly what a .bst does. That, plus sorting
and data manipulation (changing case, parsing names).

>or can use it as a round-about way to end up with a BibTeX file. In

The phrase "Bibtex file" is ambiguous. Do you mean a .bib database?

>the future, maybe more (but this would require better tools, as
>highlighted in this thread).

If this is all that CiteProc could ever do, it still seems
to have one up on bibtex: a lot more document processing systems
could be served from the same citation database.

Thank you very much for the information you have provided.

Simon Spiegel

Nov 14, 2008, 4:02:05 PM
>
>
>> We can approach this from a developer perspective or from an end user
>> perspective. End users who are writing only in LaTeX probably don't
>> need to know anything about CSL at this time. You seem to have
>> approached the topic from this stance and, to a certain extent, so has
>> Robin. Developers who are looking to build new tools _might_ be
>> interested in CSL: there are already numerous citation styles written
>> in it, it is relatively easy to write new ones, and the format is used
>> outside of the LaTeX community.
>
> Developers need to know how things will affect users. They also
> need to use a system themselves, if only to test a development.
> If I seem to be mainly user oriented, it's because I know so little
> and need that info as a first step to understanding. If I only
> wanted to write citation styles, I'd still need to know how a
> user would use them. Your last sentence suggests that LaTeX users
> might be able to share in a wider community, if only someone who
> knew both TeX and CSL could develop an appropriate tool.

That's correct.

>
>
> Is citeproc the equivalent of bibtex in the above? If so,
> it hasn't been made clear (to me) in this thread.

CSL is only the styling language, citeproc is the processor.


>
> I can add a 5th point:
> - I *use* bibtex by running "bibtex filename" where
> filename.aux contains the name of the database,
> the .bst and the citation keys. (Usually, but not
> necessarily, this file is written by LaTeX macros.)
>
> How do I *use* citeproc (or zotero or whatever)?

I guess the easiest way to find out is to actually try Zotero. Zotero
itself is a full-fledged bibliographic app. It stores its data in a
SQLite database and to format its content it uses CSL. CSL styles can
be compared to .bst files or Endnote .ens style files.


>>
>> CSL is just a set of instructions for how to format citations. You'd
>> feed a CSL file, your document, and a citation database to an
>> implementation of CiteProc (probably Zotero or citeproc-hs at this
>> time) & would generate another document.
>
> What does CiteProc do with the document itself? Is a citeproc
> implementation tied to a particular document preparation system?
>
> Bibtex, for example, uses only information supplied by the user.
> I could feed it that information without any document and it would
> happily write me a .bbl file. With the right .bst it could
> probably write it with troff markup instead of TeX.
>
> What is the format of the citation database? With bibtex it is
> the .bib file, which is independent of the bst. I assume the
> database for citeproc consists of xml markup, is it independent
> of the csl? Independent of the citeproc implementation?
>
> Also .bib files are "open-ended". That is, a bibliographic
> record can contain arbitrary fields. A BST determines which
> ones are used. Is that true of the citeproc/csl/database
> system?

I'm not completely sure, but I think at least in theory you could feed
any kind of data to a CSL system properly adapted to it. As I
said, Zotero uses a SQLite database, so in a sense there is no real
data format. AFAIK there's also an implementation for MODS files. I
might be wrong, but I think the question of the input format is only a
question of the specific implementation. I think just about any kind of data
could be converted to something CSL can handle, the question here is
only whether the respective fields can be properly mapped (I mean, you
could probably write a program which would use bst files and which
would take a different input format than bibtex).
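
The field-mapping question can be pictured with a rough Python sketch.
The mapping table and the target field names ("creator", "container",
"date") are invented for illustration and are not taken from CSL or any
other specification; the point is only that whatever has no entry in the
table is left over for manual handling.

```python
# Hypothetical mapping from .bib-style field names to a generic record.
FIELD_MAP = {
    "author": "creator",
    "title": "title",
    "journal": "container",
    "year": "date",
}

def map_record(record, field_map=FIELD_MAP):
    """Split a record into fields that can be mapped and fields that cannot."""
    mapped, unmapped = {}, {}
    for field, value in record.items():
        target = field_map.get(field.lower())
        if target is None:
            unmapped[field] = value  # no counterpart in the target format
        else:
            mapped[target] = value
    return mapped, unmapped

# Made-up entry; "shorttitle" stands in for a field the target lacks.
entry = {
    "author": "Doe, Jane",
    "title": "An Example Title",
    "journal": "Journal of Examples",
    "year": "2008",
    "shorttitle": "Example",
}

mapped, unmapped = map_record(entry)
print(mapped)
print(unmapped)
```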

simon

donate...@gmail.com

Nov 14, 2008, 5:27:32 PM
On Nov 14, 9:27 am, Dan Luecking <LookIn...@uark.edu> wrote:
> Pointers to basic (preferably
> self-contained) documentation would be especially welcome.

Unfortunately, this area still needs work. Basic information
(including links) can be found at:
http://en.wikipedia.org/wiki/Citation_Style_Language
http://www.zotero.org/support/dev/creating_citation_styles

Zotero's wiki page
<http://www.zotero.org/support/dev/csl_syntax_summary> is probably the
closest thing to Oren Patashnik's "Designing BibTeX Styles."


> Developers need to know how things will affect users.

Yes, of course. My main point was that Simon raised "CSL" as a
potential future successor to BibTeX (requiring further development) &
much of the frustration in the thread seemed to come from those that
were looking into using CSL with LaTeX right now.


> But I can read the documentation of the .bst language (and point
> anyone to that documentation, a single file).

It is actually interesting to trace Oren's 1988 manual to Patrick
Daly's 1999 documents to more recent documents, such as the LaTeX
companion and Nicolas Markey's tutorial "Taming the BeaST." (I
certainly hope that it would take less than a decade to have CSL+LaTeX
work more nicely together & less than two decades for end-user
documentation that was easy to understand!).

I agree that .bst is better documented than the newer CSL!


> I thought CSL was a system: people have been contrasting
> it to bibtex. From the above, it seems to me that downloading
> a CSL is only part of the system, and therefore could hardly
> be said to be "working".

Yes, CSL is a part of a system, just as .bst files are: it is the part
that describes how to format citations. The word "BibTeX" has
(perhaps unfortunately) been used to describe:
(1) The whole referencing system
(2) The .bib flat-file database format
(3) Various versions of the 'bibtex' program
and even, occasionally, (4) the BibTeX Style Templates (BST).

'CSL' only describes an equivalent to (4). The nearest equivalents to
the others:
(1) probably 'XBib' <http://xbiblio.sourceforge.net/>, but most
implementations aren't hosted there.
(2) perhaps the bibliographic ontology <http://bibliontology.com/>,
but there are no implementations of this yet & CiteProc often has
other input formats (see below)
(3) most often 'citeproc' with some link to the language it was
implemented in.
(4) CSL


Allow me to draw equivalents. It is difficult, given the multiple
implementations of CiteProc+CSL. We will look at:
(a) Zotero
(b) pandoc+citeproc-hs

> With bibtex I can
With (a) or (b), you can...

> - obtain a working bibtex by installing TeX Live or MiKTeX.

(a) Install Firefox+Zotero
(b) Install Haskell, citeproc-hs, and pandoc

> - I feed it .bst styles and .bib databases.

(a) Feed Zotero with CSL styles and your Zotero database (which can be
fed MODS, .bib, .ris, and other diverse formats).
(b) Feed pandoc+citeproc-hs with CSL styles and MODS XML

> - It is capable of sorting, parsing names, etc., and writing
> arbitrary text (usually some sort of bibliography environment
> complete with \bibitem commands for each entry).

Zotero and pandoc+citeproc-hs are capable of sorting, parsing names,
and writing arbitrary text, but primary formats are currently:
(a) text, HTML, RTF, or through plugins to word processors
(b) Markdown (although pandoc has numerous other import/export
formats, including LaTeX)

> - I train it by writing a .bst or finding one that does what I
> want.

You train it by writing a .csl or finding one that does what you want.


> - I *use* bibtex by running "bibtex filename" where
> filename.aux contains the name of the database,
> the .bst and the citation keys. (Usually, but not
> necessarily, this file is written by LaTeX macros.)

(a) You use a graphical user interface that uses the formats
enumerated above.
(b) You run 'pandoc --csl apa.csl --mods modsCollection.xml filename'
'apa.csl' describes how citations are formatted
'modsCollection.xml' has reference information (equivalent to a .bib
file)
'filename' is most often a markdown-formatted file that uses cite
tags resembling [Smith99; Jones01@ p. 10] (note the ability to specify
a 'locator' within a text.) Pandoc reads and writes other formats
(HTML, LaTeX, etc.).
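
As a rough illustration of what such a cite tag carries, here is a small
Python sketch that splits a tag into citation keys and optional
locators. The syntax is inferred solely from the single example above
(keys separated by ";", a locator after "@"), so treat it as an
assumption rather than a description of pandoc's actual parser; the
function name parse_cite_tag is made up.

```python
def parse_cite_tag(tag):
    """Split a tag like '[Smith99; Jones01@ p. 10]' into (key, locator) pairs."""
    inner = tag.strip()
    if not (inner.startswith("[") and inner.endswith("]")):
        raise ValueError("not a bracketed cite tag: %r" % tag)
    citations = []
    for part in inner[1:-1].split(";"):
        # Everything after the first "@" is taken to be the locator.
        key, _, locator = part.partition("@")
        citations.append((key.strip(), locator.strip() or None))
    return citations

print(parse_cite_tag("[Smith99; Jones01@ p. 10]"))
```

The keys are what gets looked up in the reference file; the locator is
passed through to the style to be formatted.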


> What does CiteProc do with the document itself? Is a citeproc
> implementation tied to a particular document preparation system?

If you think of "CiteProc" as a "meta project," it is agnostic to
document preparation systems. There are multiple implementations of
CiteProc & each different implementation is able to use different
document preparation systems.


> Bibtex, for example, uses only information supplied by the user.
> I could feed it that information without any document and it would
> happily write me a .bbl file.

CiteProc is similar.


> With the right .bst it could
> probably write it with troff markup instead of TeX.

The same can be done in CSL, by making a style that uses structured
text (such as the CSL file that generates a BibTeX .bib database).
However, since particular implementations of CiteProc understand
various document systems, there's no reason to put this lexical
formatting in a CSL file. Instead, you use semantic information that
is transformed to a specific language by CiteProc.


> What is the format of the citation database? With bibtex it is
> the .bib file, which is independent of the bst. I assume the
> database for citeproc consists of xml markup, is it independent
> of the csl? Independent of the citeproc implementation?

Again, this is implementation-specific (see above). It is not tied to
XML; Zotero uses their own database. Many implementations currently
support MODS XML, though.


> Also .bib files are "open-ended". That is, a bibliographic
> record can contain arbitrary fields. A BST determines which
> ones are used. Is that true of the citeproc/csl/database
> system?

Not only are some supported database formats extensible, but they can
be truly hierarchical.


> >As above, not much right now: you can generate LaTeX-formatted markup
>
> Hell, that's exactly what a .bst does. That, plus sorting
> and data manipulation (changing case, parsing names).

Yes-and-no. Because no CiteProc implementation currently has a
complete/native understanding of LaTeX, the .bbl files are quite a bit
more useful (containing semantic information, leaving cite commands in
place, etc.) than current LaTeX generated by pandoc/citeproc-hs.


> >or can use it as a round-about way to end up with a BibTeX file. In
>
> The phrase "Bibtex file" is ambiguous. Do you mean a .bib database?

Yes.

--
MK

Tom Dye

Nov 14, 2008, 8:36:48 PM
Sorry to jump in late, but I am interested in the discussion because it
seems to be leading to a better way to manage bibtex/biblatex
databases. This is one of the hard things about bibtex, the other
being changing a style or writing a new style to conform to some
standard. I haven't had time to investigate biblatex fully, but I am
confident from what I've read that it will take care of the style
problem when the LaTeX community develops a critical mass of biblatex
styles.

It is already possible to manage a bibtex database in Zotero, which
will export to bibtex database format from its internal format. The
ability with Zotero to capture references from, say, library web sites
already reduces the amount of time spent entering bibliographic data,
though many references obtained this way need to be cleaned up a bit
before they are really useful.

It is interesting to me that Zotero uses an SQLite database for its
internal storage; this suggests that it would be possible to use
standard database management tools on the bibliographic data, in
addition to the rather limited management facilities present already in
Zotero.

AFAIK, there is as yet no Zotero to biblatex converter. It would be
useful to have one. Even better would be a mechanism to query the
Zotero database directly so the biblatex end-user wouldn't have to
export from Zotero every so often.

I think the ideal bibliographic data system would:
1) capture references from the web,
2) store them in a way that could be managed with standard database tools,
3) provide a user-friendly data entry environment, and
4) be directly accessible by biblatex.
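
Points 2 and 4 can be sketched in Python with the standard sqlite3
module. The table layout below (one row per field, in a table called
"fields") is entirely hypothetical and is NOT Zotero's actual schema;
the sketch only shows that once references sit in an ordinary SQL
database, producing .bib-style entries from a query is a short step.

```python
import sqlite3

def export_bib(conn):
    """Render every reference in the hypothetical 'fields' table as a .bib entry."""
    entries = {}
    query = "SELECT citekey, entrytype, name, value FROM fields ORDER BY citekey, name"
    for citekey, entrytype, name, value in conn.execute(query):
        entries.setdefault((citekey, entrytype), []).append((name, value))
    chunks = []
    for (citekey, entrytype), fields in entries.items():
        body = ",\n".join("  %s = {%s}" % (name, value) for name, value in fields)
        chunks.append("@%s{%s,\n%s\n}" % (entrytype, citekey, body))
    return "\n\n".join(chunks)

# In-memory database with one made-up reference.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fields (citekey TEXT, entrytype TEXT, name TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO fields VALUES (?, ?, ?, ?)",
    [
        ("doe08", "article", "title", "An Example Title"),
        ("doe08", "article", "author", "Doe, Jane"),
        ("doe08", "article", "year", "2008"),
    ],
)

print(export_bib(conn))
```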

Tom


--
Tom Dye
T. S. Dye & Colleagues, Archaeologists, Inc.
Honolulu, Hawai`i

Simon Spiegel

Nov 15, 2008, 2:47:40 AM
On 2008-11-15 02:36:48 +0100, Tom Dye <t...@tsdye.com> said:

> Sorry to jump in late, but I am interested in the discussion because it
> seems to be leading to a better way to manage bibtex/biblatex
> databases. This is one of the hard things about bibtex, the other
> being changing a style or writing a new style to conform to some
> standard. I haven't had time to investigate biblatex fully, but I am
> confident from what I've read that it will take care of the style
> problem when the LaTeX community develops a critical mass of biblatex
> styles.
>
> It is already possible to manage a bibtex database in Zotero, which
> will export to bibtex database format from its internal format. The
> ability with Zotero to capture references from, say, library web sites
> already reduces the amount of time spent entering bibliographic data,
> though many references obtained this way need to be cleaned up a bit
> before they are really useful.
>
> It is interesting to me that Zotero uses an SQLite database for its
> internal storage; this suggests that it would be possible to use
> standard database management tools on the bibliographic data, in
> addition to the rather limited management facilities present already in
> Zotero.
>
> AFAIK, there is as yet no Zotero to biblatex converter. It would be
> useful to have one. Even better would be a mechanism to query the
> Zotero database directly so the biblatex end-user wouldn't have to
> export from Zotero every so often.

Actually, there is. Zotero can export BibTeX files. There even is, for
OSX users, a script which will automatically transfer newly added
Zotero entries to BibDesk, a popular BibTeX GUI. The problem with the
BibTeX export is the same one all bibliographic apps have: it's limited
to a fairly small, standardized set of fields. If you use biblatex, for example,
and heavily rely on its additional fields, there's still a lot of
manual work needed after the export. But basic BibTeX is here, and I
think has been here since the first release of Zotero.

Simon


Tom Dye

Nov 15, 2008, 11:41:27 AM

Yes, I think we agree. A more capable Zotero-like application to
manage the biblatex database and a mechanism to allow biblatex to query
the database in the Zotero-like app's native format (so the end-user
could skip the export step) would be deluxe.

Tom

donate...@gmail.com

Nov 16, 2008, 6:17:38 PM
On Nov 12, 3:01 pm, Bruce <bdarcus.li...@gmail.com> wrote:
> Zotero wrote their own Javascript-based implementation and I have
> often wondered if it would be possible/sensible to write a version in
> Lua that could hook up with LuaTeX.

While this is a great longer-term goal, I wonder how difficult it
would be to use citeproc-py with CrossTeX:
http://www.cs.cornell.edu/people/egs/crosstex/

--
MK

phil...@gmail.com

Nov 17, 2008, 12:44:18 PM
On Nov 15, 5:41 pm, Tom Dye <t...@tsdye.com> wrote:

> Yes, I think we agree.  A more capable Zotero-like application to
> manage the biblatex database and a mechanism to allow biblatex to query
> the database in the Zotero-like app's native format (so the end-user
> could skip the export step) would be deluxe.

I still think that we're not really understanding something fairly
crucial here: this isn't just about data models and stateless
conversions. At some point the data has to be typeset - you have to
generate some semantic output. Now it's a common problem in AI that
stateless transitions where you have a tool that outputs a file which
is processed by another tool which outputs another file etc. are nice,
clean and general but don't work for complex things. Like doing
sophisticated citations and references. We are talking as if a .bbl
file is the typeset bibliography but all it is really is a subset of
the .bib data presented in some macros that do some of the typesetting
(not even that in biblatex). Typically, you need a .sty file in
addition to do anything complex with the .bbl. It's the semantics in
the .sty file (or the biblatex .cbx and .bbx files) which do all the
complex stuff needed to actually generate typeset material. Just look
at the biblatex formatting style files or even just plain old BibTeX
natbib.sty. This is the meat which actually typesets the data and it's
completely wedded to the typesetting system (here, TeX). There is no
way at all to duplicate most of this level of things in any general
CSL-like implementation.

I think it's easy to think that this problem is less hard than it
looks because there are "plug-ins" for various WP systems. Typically,
the plug-ins just spit out RTF or Word COM commands or whatever which
tell the app what to print for the bibliography. This is fine for
simple things because you don't need state from the typesetting of the
rest of the document. In fact, it's generally hard to do such stateful
things in most WP apps anyway (unless you start messing about with the
horrors of VBscript etc. which are really nasty ways of doing
typesetting coding if you've ever tried).

In brief, the assumption here is that you can take some data from your
bib database, coerce it into macros which will print something and do
this completely independently of the processing of the rest of the
document. This is only possible for fairly generic bibliographies
without much integrated citation support etc. HOWEVER, of course you
*can* extend a generic system IN PRINCIPLE to cover the harder things.
But then all you are doing is creating a non-generic system. That's
why these arguments that "Citeproc could be extended to do that" are a
bit beside the point. Of course any vaguely Turing-complete anything
can be extended to do anything vaguely Turing computable but this is
then a completely different question altogether. We started by talking
about what to replace the data backend and model of biblatex with and
end up saying "we could replace any bibliography system with any other
if we worked on it enough". Of course that's true but it has nothing
to do with the original question!

Here's a quick example since people were asking for one. The APA style
manual requires that for citations, the year is only mentioned the
first time the citation is used in a paragraph. This is impossible to
do in a general citation/bib system because it doesn't understand the
notion of "paragraph", nor does it know how the typesetting system
implements the notion. You have to be tightly bound in state and
syntax to the document typesetting processing in order to do this.
Running a general bib system on a list of mentioned citations which
then outputs a file can't do this. You could augment it with hooks
into the processing and special syntax and intermediate file formats
etc. etc. but then it's not a general system any more and you're just
duplicating (almost certainly poorly too) parts of the typesetting
system you are supposed to be providing a general service to.
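
The state involved can be shown with a toy Python sketch of the rule as
stated above (year printed only on the first citation within a
paragraph). It assumes the caller tells the formatter when a new
paragraph starts, which is precisely the information a stand-alone
bibliography processor does not have. The class and method names are
invented for illustration.

```python
class ParagraphAwareCiter:
    """Toy formatter: prints the year only on first mention per paragraph."""

    def __init__(self, entries):
        self.entries = entries  # citation key -> (author, year)
        self.seen = set()       # keys already cited in the current paragraph

    def new_paragraph(self):
        # Assumption: the typesetting side signals every paragraph break.
        self.seen.clear()

    def cite(self, key):
        author, year = self.entries[key]
        if key in self.seen:
            return author
        self.seen.add(key)
        return "%s (%s)" % (author, year)

# Made-up reference data.
citer = ParagraphAwareCiter({"smith99": ("Smith", "1999")})
print(citer.cite("smith99"))  # first mention in the paragraph
print(citer.cite("smith99"))  # later mention, same paragraph
citer.new_paragraph()
print(citer.cite("smith99"))  # first mention in the next paragraph
```

The formatting logic is trivial; the hard part is the new_paragraph
call, which only the typesetting system is in a position to make.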

We need to understand at what level a generic system has to stop and
give way to non-generic domain-specific issues. Is there a general way
to provide *all* aspects of bibliography and citation needs to a large
selection of document formats? No. There is certainly a general way
to provide *some* aspects, like the data model and format etc. and
that's what we started talking about in the first place. You can
cobble together some plug-ins to try to increase the general coverage
of a general system but this never works that well on anything other
than fairly simple examples. No reasonably sophisticated LaTeX user
has ever had automated and solid results from any "LaTeX converter"
ever made. Of course you can do something "ok" in terms of outputting
LaTeX macros using purely syntactic conversions but since we have
biblatex which does an excellent job, why would we want to replace
this level of the processing with something which could only ever be
average by comparison?

My experience with these things is that you only really learn this by
trying to build general systems and then failing to match the
performance of specific systems so perhaps there isn't much point in
talking about this until someone tries to build a generalised biblatex
replacement and fails ...

Simon Spiegel

Nov 17, 2008, 2:03:09 PM

If CSL (or any other system) actually allowed users to use the same
data and the same style files for different applications, this would
already be a huge improvement over the current situation IMO – even if
it was only for "fairly generic bibliographies".

Look at the current situation: The standard outside the LaTeX world is
still Endnote which is not only a horrible application but is also
extremely limited in its style files (I just had to learn that again
last week when I was asked to create a generic style for our
institute). Compared to what Endnote currently offers, CSL is in a
completely different league. Besides Endnote there are many other
smaller apps with proprietary formats (for example Sente and Bookends
on the Mac). Interestingly enough, one of the most widely supported
import/export formats among these apps is BibTeX ... But no matter
which way your conversion will go, you will end up doing a lot of
cleaning by hand, because as soon as you go beyond the most simple
fields like 'author' or 'title', there are no two apps (or data
formats) which are alike. It's even worse with style files: Outside of
the LaTeX world, a .bst file won't help you at all, and every app has
its own style format. Actually, I think Zotero is now the first app
which is actually able to read Endnote's .ens files (which brought them
a lawsuit from Thomson ...).

IMO, even a fairly generic solution would already mean a big
improvement for anyone who uses different applications to write text
and has to exchange data in several formats.

simon

Bruce

Nov 17, 2008, 2:32:11 PM
On Nov 17, 12:44 pm, philk...@gmail.com wrote:
> On Nov 15, 5:41 pm, Tom Dye <t...@tsdye.com> wrote:
>
> > Yes, I think we agree.  A more capable Zotero-like application to
> > manage the biblatex database and a mechanism to allow biblatex to query
> > the database in the Zotero-like app's native format (so the end-user
> > could skip the export step) would be deluxe.
>
> I still think that we're not really understanding something fairly
> crucial here: this isn't just about data models and stateless
> conversions.

I think what he's asking for may well not be that complex; maybe a
script that can keep an updated bib file in sync from a Zotero db?
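
A minimal sketch of what the export half of such a script might look like, in Python. The record layout (dicts with "key", "type" and "fields") is an assumption made up for illustration and is not Zotero's actual schema; reading the Zotero database itself is left out. The sketch only rewrites the .bib file when its content would change.

```python
from pathlib import Path


def to_bibtex(record):
    """Render one record (hypothetical dict layout) as a BibTeX entry."""
    lines = [f"@{record['type']}{{{record['key']},"]
    for name, value in sorted(record["fields"].items()):
        lines.append(f"  {name} = {{{value}}},")
    lines.append("}")
    return "\n".join(lines)


def sync_bib(records, bib_path):
    """Rewrite bib_path from records; return True if the file changed."""
    path = Path(bib_path)
    ordered = sorted(records, key=lambda r: r["key"])
    new_text = "\n\n".join(to_bibtex(r) for r in ordered) + "\n"
    old_text = path.read_text(encoding="utf-8") if path.exists() else None
    if new_text == old_text:
        return False  # nothing to do, leave the file untouched
    path.write_text(new_text, encoding="utf-8")
    return True
```

Run periodically (or from a file watcher), this would keep the .bib file current without a manual export step, though it does nothing about fields the source database cannot represent.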

But that aside, on to your bigger/other point ...

> At some point the data has to be typeset - you have to
> generate some semantic output. Now it's a common problem in AI that
> stateless transitions where you have a tool that outputs a file which
> is processed by another tool which outputs another file etc. are nice,
> clean and general but don't work for complex things. Like doing
> sophisticated citations and references.

But the current evidence suggests that the general approach of CSL is
working pretty well. There are some corner case exceptions that I
don't want to ignore, but CSL has managed to represent most real world
(e.g. "sophisticated") styles quite well.

> We are talking as if a .bbl
> file is the typeset bibliography but all it is really is a subset of
> the .bib data presented in some macros that do some of the typesetting
> (not even that in biblatex). Typically, you need a .sty file in
> addition to do anything complex with the .bbl. It's the semantics in
> the .sty file (or the biblatex .cbx and .bbx files) which do all the
> complex stuff needed to actually generate typeset material. Just look
> at the biblatex formatting style files or even just plain old BibTeX
> natbib.sty. This is the meat which actually typesets the data and it's
> completely wedded to the typesetting system (here, TeX). There is no
> way at all to duplicate most of this level of things in any general
> CSL-like implementation.

As above, I guess I'd challenge you to find some real examples of
these hypothetical limitations of the "general CSL-like
implementation."

> I think it's easy to think that this problem is less hard than it
> looks because there are "plug-ins" for various WP systems. Typically,
> the plug-ins just spit out RTF or Word COM commands or whatever which
> tell the app what to print for the bibliography. This is fine for
> simple things because you don't need state from the typesetting of the
> rest of the document. In fact, it's generally hard to do such stateful
> things in most WP apps anyway (unless you start messing about with the
> horrors of VBscript etc. which are really nasty ways of doing
> typesetting coding if you've ever tried).
>
> In brief, the assumption here is that you can take some data from your
> bib database, coerce it into macros which will print something and do
> this completely independently of the processing of the rest of the
> document.

I wouldn't go so far as to say "completely independently." Certainly
dealing with issues like first/subsequent and ibid handling, or note-
based citations, requires some consideration of the document. My point
is simply that I don't accept that citation and bibliographic style
configuration MUST be dependent on the low-level details of a
typesetting language.

> This is only possible for fairly generic bibliographies
> without much integrated citation support etc. HOWEVER, of course you
> *can* extend a generic system IN PRINCIPLE to cover the harder things.
> But then all you are doing is creating a non-generic system.

Not sure how the latter follows from the former. For sake of argument,
let's say you need to be able to define the scope of ibid handling for
a page, rather than for the document. There's no way now to do that in
CSL. But, there's no reason one couldn't propose such an addition,
which others could then implement.

<option name="ibid-scope" value="page"/>

There's no magic that means that will work in any particular
implementation without code, of course, but there's still room for
that sort of evolution at the style level.

> That's why these arguments that "Citeproc could be extended to do that" are a
> bit beside the point. Of course any vaguely Turing-complete anything
> can be extended to do anything vaguely Turing computable but this is
> then a completely different question altogether. We started by talking
> about what to replace the data backend and model of biblatex with and
> end up saying "we could replace any bibliography system with any other
> if we worked on it enough". Of course that's true but it has nothing
> to do with the original question!

I haven't been involved in this whole thread, so apologies if I've
contributed to getting things off track ;-)

> Here's a quick example since people were asking for one. The APA style
> manual requires that for citations, the year is only mentioned the
> first time the citation is used in a paragraph.

I'm not understanding. Do you mean if I have ...

"According to Doe (1999), x, y, z. But a is also true (Doe, 1999)."

... that the output result should be ...

"According to Doe (1999), x, y, z. But a is also true (Doe)."

... ?

> This is impossible to do in a general citation/bib system because it doesn't understand the
> notion of "paragraph", nor does it know how the typesetting system
> implements the notion. You have to be tightly bound in state and
> syntax to the document typesetting processing in order to do this.
>
> Running a general bib system on a list of mentioned citations which
> then outputs a file can't do this. You could augment it with hooks
> into the processing and special syntax and intermediate file formats
> etc. etc. but then it's not a general system any more  and you're just
> duplicating (almost certainly poorly too) parts of the typesetting
> system you are supposed to be providing a general service to.
>
> We need to understand at what level a generic system has to stop and
> give way to non-generic domain-specific issues. Is there a general way
> to provide *all* aspects of bibliography and citation needs to a large
> selection of document formats? No. There is certainly a general way
> to provide *some* aspects, like the data model and format etc. and
> that's what we started talking about in the first place. You can
> cobble together some plug-ins to try to increase the general coverage
> of a general system but this never works that well on anything other
> than fairly simple examples.

Again: this is more assertion than fact at this point. I guess it
really depends on what you call "fairly simple examples," but based on
my own experience, I think you're exaggerating.

> No reasonably sophisticated LaTeX user
> has ever had automated and solid results from any "LaTeX converter"
> ever made. Of course you can do something "ok" in terms of outputting
> LaTeX macros using purely syntactic conversions but since we have
> biblatex which does an excellent job, why would we want to replace
> this level of the processing with something which could only ever be
> average by comparison?

I'm not sure what to make of all the value-laden language of
"sophisticated" vs. "simple" and "excellent" vs. "average." If you're
happy spending all your time writing LaTeX-specific styles, and you
don't have publishers or editors refusing to accept those documents,
then sure. That's not the world I live in though.

> My experience with these things is that you only really learn this by
> trying to build general systems and then failing to match the
> performance of specific systems so perhaps there isn't much point in
> talking about this until someone tries to build a generalised biblatex
> replacement and fails ...

Fine with me.

Bruce

Tom Dye

Nov 17, 2008, 11:05:25 PM

Yes, a script that kept a biblatex database in sync with a Zotero-like
db would be neat. I'm glad Bruce doesn't think that's a complex
thing--maybe the script will be written someday! (There's also the
matter of capturing biblatex's level of detail in the Zotero-like db).

On the biblatex vs CSL question, it's really not either/or. One of the
heartening things about the maturation of biblatex is that there will
be a choice of bibliography systems in the TeX world. I'm eager to
find the time to make the transition from bibtex to biblatex, something
that will happen when various biblatex styles have been developed and
are stable. If a CSL solution were available and had advantages over
the other two, in the same way that biblatex has advantages over
bibtex, then I'd consider that, too.

Bruce makes a valid point about difficulties with editors and
publishers accepting LaTeX documents, but I can't see that this is
something that a CSL implementation of the bibliography system is going
to solve. I'll be happy to be wrong about this!

phil...@gmail.com

Nov 18, 2008, 6:48:53 PM
On Nov 17, 8:32 pm, Bruce <bdarcus.li...@gmail.com> wrote:

> But the current evidence suggests that the general approach of CSL is
> working pretty well.

I suppose it depends what you mean. It works pretty well in covering
many bibliography styles, and data certainly. But as I say, this data
needs to go onto paper or a screen at some point and you can't expect
it to be generally good at doing this for Word, RTF, LaTeX, HTML etc.
without invoking *as part of the bib processing* any of their
respective functionalities. It's like expecting a general purpose
cleaning product to be as good on say, silver, as one specifically
made for silver. It can be good enough and if you are aiming for cost-
savings and coverage, then it may well be a better choice. I'm really
only trying to keep clear what the compromises are.

> I don't accept that citation and bibliographic style
> configuration MUST be dependent on the low-level details of a
> typesetting language.

I think this is an empirical matter really - try to make a general
system and I guarantee that this is what you'll find more and more as
you extend it. You can eliminate the "must" if you stay fairly
generic. That's why we end up with specialised languages and
formalisms all over the place - because a general solution is never
adequate for details (even though it can be good and be perfectly fine
for most cases).

> <option name="ibid-scope" value="page"/>
>
> There's no magic that means that will work in any particular
> implementation without code, of course, but there's still room for
> that sort of evolution at the style level.

This is just syntax though - I can't see how a style could do anything
with this, since unless you are "inside" a typesetting run (in LaTeX,
for example), what does "page" mean? This is the issue -
outside of the specific system which writes the dots, "page" doesn't
really mean anything - what size of page, what font size, what inter-
line spacing, what justification etc.? I don't think we're actually
disagreeing much here - I think that you can quite easily specify
things as you say and pass such data to the typesetting algorithms to
do the work but then we have a clear demarcation between the data
model and the typesetting. Which means "replacing biblatex" is a
misnomer - we are only talking about replacing the data model.

> I'm not understanding. Do you mean if I have ...
>
> "According to Doe (1999), x, y, z. But a is also true (Doe, 1999)."
>
> ... that the output result should be ...
>
> "According to Doe (1999), x, y, z. But a is also true (Doe)."

Exactly, but automated so that the cite command for both is the same,
with underlying code detecting whether you are still in the same
paragraph for the second cite. The only way I think you can do this in
something like CSL is to set some attribute on citation extraction
which means having some sort of detection of when you are in a
different paragraph in the document source. This is unreliable because
the source isn't (in TeX) necessarily a good guide to the printed
output (\marginpar, for example). It also means more and more specific
code to help CSL deal with TeX. This is fine but then you are making a
non-generic system and we already have a good one of those -
biblatex.
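
To make the state dependency concrete, here is a small Python sketch. It is not biblatex or CSL code, and the class and method names are made up for illustration. The formatter only produces the right output if something calls new_paragraph() at the true paragraph boundaries, and that is exactly the information that has to come from the typesetting run.

```python
class ParagraphCiteFormatter:
    """Omit the year on repeat citations within one paragraph."""

    def __init__(self):
        self._seen = set()

    def new_paragraph(self):
        # Has to be triggered by the host at real paragraph breaks;
        # the formatter cannot work this out by itself.
        self._seen.clear()

    def cite(self, author, year):
        key = (author, year)
        if key in self._seen:
            return f"({author})"
        self._seen.add(key)
        return f"({author}, {year})"
```

With this, two calls to cite("Doe", 1999) in a row give "(Doe, 1999)" and then "(Doe)", and the year comes back after new_paragraph(). The logic is trivial; the hard part is who calls new_paragraph() and when.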

> I'm not sure what to make of all the value-laden language of
> "sophisticated" vs. "simple" and "excellent" vs. "average." If you're
> happy spending all your time writing LaTeX-specific styles, and you
> don't have publishers or editors refusing to accept those documents,
> then sure. That's not the world I live in though.

Right, and that's the split in this debate - it started about LaTeX
only and ended up as a general bib system debate which means there is
a lot of cross-purpose discussion. If I needed a system to move around
between document prep systems, I wouldn't use .bib files either (but
I'd convert to them so I could use biblatex because I think it's never
going to be matched by a general purpose system for LaTeX documents).

donate...@gmail.com

Nov 18, 2008, 7:27:51 PM
Wouldn't an implementation of CiteProc that was implemented in LuaTeX
be privy to all information that BibLaTeX is using currently?
Similarly, if CrossTeX is already using custom python-based
bibliographic styles, why wouldn't it be possible to serve the same
function using a CSL file and citeproc-py?

CiteProc isn't a one-size-fits-all generic solution; it has several
implementations. It is general where it makes sense to be general
(e.g. "write one set of style instructions in CSL," and "use one
database for bibliographic metadata"), and is specialized where it
makes sense to be specialized (e.g. "If using MS Word, use a plugin to
that system"). BibLaTeX is specialized where it does not need to be:
it forces authors to write style instructions that can be used in no
other programs.

Yes, the results will be lousy if you set pandoc loose on a .tex file
& it uses the exact same code as it does when you run it on
a .markdown file. No, that is not how a LaTeX-based CiteProc
implementation would work.

--
MK

Simon Spiegel

Nov 19, 2008, 2:54:35 AM
On 2008-11-19 00:48:53 +0100, phil...@gmail.com said:
>
>
>> <option name="ibid-scope" value="page"/>
>>
>> There's no magic that means that will work in any particular
>> implementation without code, of course, but there's still room for
>> that sort of evolution at the style level.
>
> This is just syntax though - I can't see how a style could do anything
> with this since unless you are "inside" a typesetting system run in
> LaTeX, for example - what does "page" mean? This is the issue -
> outside of the specific system which writes the dots, "page" doesn't
> really mean anything - what size of page, what font size, what inter-
> line spacing, what justification etc.? I don't think we're actually
> disagreeing much here - I think that you can quite easily specify
> things as you say and pass such data to the typesetting algorithms to
> do the work but then we have a clear demarcation between the data
> model and the typesetting. Which means "replacing biblatex" is a
> misnomer - we are only talking about replacing the data model.

I don't really understand the problem here. Of course, an
implementation of CSL *for LaTeX* would know what a page is *in
LaTeX*. It's the same when a CSL demands something to be printed in
italics. For this to work, CSL for LaTeX needs to know that
font-style="italic" must be translated into \emph{}. IMO that's not a
fundamental problem but just one of the normal issues every CSL implementation
will have to face.
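
As a rough illustration of that per-implementation translation, sketched in Python: the font-style attribute is CSL's, but the backend table and the function name below are invented for the example and do not come from any existing citeproc.

```python
# Hypothetical per-backend renderings of CSL's font-style="italic".
ITALIC_WRAPPERS = {
    "latex": ("\\emph{", "}"),
    "html": ("<i>", "</i>"),
}


def render(text, attrs, backend):
    """Apply the one formatting attribute this sketch understands."""
    if attrs.get("font-style") == "italic":
        before, after = ITALIC_WRAPPERS[backend]
        return f"{before}{text}{after}"
    return text
```

The style file stays the same for every backend; only the small table at the top differs between a LaTeX and an HTML implementation.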

As the one originally responsible for the start of this subthread, I
have to say: That's exactly the point. You wouldn't use .bib files to
move data around between different systems. Well, what would you use?
The problem is exactly that there is no unifying standard, that most
data formats are limited and are hard to map to each other. Biblatex is
a blessing because it offers all kinds of new fields compared to
standard LaTeX, but if you have to get that data into another system,
much of it will be lost. And that was originally the point I was trying
to make: If Zotero or CSL actually leads to a well-defined and rich
*data format* which can be used across different systems, this would be
a huge win for biblatex. You're right, that is only about the data
format and not the formatting part of CSL, but if we could reach
compatibility here, that would already be a huge win.

But – and I already said this earlier – this is really purely
theoretical for now. There is no CSL implementation for LaTeX; and as
it is currently implemented in Zotero, CSL is still too limited, both
in terms of data model and styling language. There is, for example, no
way to match biblatex's 'bookauthor' in Zotero/CSL ATM, which makes
it almost useless for me. If I understand correctly, Zotero 1.5 should
bring a much improved, hierarchical data model, but we'll have to wait
for that.
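
The kind of loss described above can be sketched in Python. The set of target fields is a made-up stand-in for a flatter data model and is not Zotero's real schema; 'bookauthor' is the biblatex field mentioned in the previous paragraph.

```python
# Hypothetical flat target model; not any real application's field list.
TARGET_FIELDS = {"author", "title", "booktitle", "editor", "year"}


def convert(entry):
    """Split a biblatex-style entry into kept fields and dropped field names."""
    kept = {k: v for k, v in entry.items() if k in TARGET_FIELDS}
    dropped = sorted(k for k in entry if k not in TARGET_FIELDS)
    return kept, dropped
```

Any field reported in the dropped list is data that cannot be recovered when converting back, which is why a shared, rich data format matters more than the conversion code.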

simon


Mariano Suárez-Alvarez

Nov 19, 2008, 9:47:38 AM
On Nov 19, 5:54 am, Simon Spiegel <si...@remove.simifilm.ch> wrote:

> On 2008-11-19 00:48:53 +0100, philk...@gmail.com said:
>
> >> <option name="ibid-scope" value="page"/>
>
> >> There's no magic that means that will work in any particular
> >> implementation without code, of course, but there's still room for
> >> that sort of evolution at the style level.
>
> > This is just syntax though - I can't see how a style could do anything
> > with this since unless you are "inside" a typesetting system run in
> > LaTeX, for example - what does "page" mean? This is the issue -
> > outside of the specific system which writes the dots, "page" doesn't
> > really mean anything - what size of page, what font size, what inter-
> > line spacing, what justification etc.? I don't think we're actually
> > disagreeing much here - I think that you can quite easily specify
> > things as you say and pass such data to the typesetting algorithms to
> > do the work but then we have a clear demarcation between the data
> > model and the typesetting. Which means "replacing biblatex" is a
> > misnomer - we are only talking about replacing the data model.
>
> I don't really understand the problem here. Of course, an
> implementation of biblatex *for LaTeX* would know what a page is *in
> LaTeX*. It's the same when a CSL demands something to be printed in
> italics. For this to work, CSL for LaTeX needs to know that
> font-style="italic" must be translated into \emph{}. IMO that's not a
> fundamental problem but just one of the normal issues every CSL implementation
> will have to face.

Using biblatex allows you to do things like formatting the first
reference to a book differently than the other references to
the same book that appear on the same page (or on the same
spread?) You can do the evil "idem" and "ibidem" thingies
some formats are so fond of. And so on.

Those kinds of things are simply not doable from outside the
typesetting engine.

Biblatex is *much* more than a data source.

-- m

Simon Spiegel

Nov 19, 2008, 10:09:41 AM
On 2008-11-19 15:47:38 +0100, Mariano Suárez-Alvarez
<mariano.su...@gmail.com> said:

I know biblatex pretty well and love it. I think I can even brag about
the fact that I wrote the first commercially published book which made
heavy use of biblatex. So I know what it's capable of. And I agree that
there is some fine-grained behaviour which probably is not portable. But
the idem stuff is not part of that.

Let's say CSL had an option like <option name="ibid-scope"
value="page"/> which would describe that ibids or ibidems are only
created when the references are on the same page. I don't see why this
couldn't be done. Again: CSL or citeproc is not an app which itself
handles all kinds of document formats, it must be implemented for each
word processor or typesetter. So if CSL was implemented in LaTeX (let's
say with LuaTeX) it wouldn't be "outside the typesetting engine". It
would have access to this kind of stuff, it would know where on the
page the citation is. The same if you implemented this option for Word.
I guess Word knows where on the page something is and makes this data
available (or maybe not, then this option would have to be ignored in
this specific implementation). So if CSL provided syntax to describe
this kind of thing it would be up to different implementations to make
this happen. I certainly don't see any reason why this shouldn't be
possible.
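
Sketched in Python, with two caveats: ibid-scope is only the option proposed in this thread and is not part of CSL, and the function names are invented. The style carries the option, and each implementation decides what to do with it depending on whether its host can report page numbers.

```python
import xml.etree.ElementTree as ET


def ibid_scope(option_xml):
    """Read the proposed (hypothetical) ibid-scope option; default is document."""
    elem = ET.fromstring(option_xml)
    if elem.get("name") == "ibid-scope":
        return elem.get("value", "document")
    return "document"


def mark_ibids(cites, scope):
    """cites: ordered list of (key, page); page is None if the host cannot say.

    Returns one bool per citation: True where "ibid." would be used.
    """
    # If the host cannot supply page numbers, the option is ignored.
    use_pages = scope == "page" and all(page is not None for _, page in cites)
    result = []
    previous = None
    for key, page in cites:
        is_ibid = previous is not None and previous[0] == key
        if is_ibid and use_pages:
            is_ibid = previous[1] == page
        result.append(is_ibid)
        previous = (key, page)
    return result
```

A LaTeX-side implementation would fill in the page values during the typesetting run; an implementation for a host that cannot report pages would pass None and fall back to document-wide ibid handling.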


simon
