Improving BibTeX export
May 4 2011, 11:53 pm
spartan <spartan...@gmail.com>
Wed, 4 May 2011 20:53:07 -0700 (PDT)
Local: Wed, May 4 2011 11:53 pm
Subject: Improving BibTeX export
This is a re-post of my post at zotero-forum:
http://forums.zotero.org/discussion/17847/improving-bibtex-export/

I love zotero so much. But got disturbed by the poor support of
bibtex. So I spent quite some time to improve the functionality of the
bibtex export. It should be very useful for those who use bibtex/latex
a lot, in particular for scientists like me. But I don't know where I
should post my modified codes.

Here is the brief list of improvements from my hard work:
1. much more flexible (user-controlled) bibtex key generation: from
author names, initials, title, journal, year, volume, pages, etc.
2. the suffix for key collisions can be numeric or alphabetic
3. key format strings can be read from prefs
(extensions.zotero.bibtex....)
4. attachments like pdf files can be exported with real path links
only. now you can easily get the resulting bibtex file to work with other
external applications like JabRef and Mendeley.
5. unicode conversions for greek letters
6. user-specified field like callNumber to export pre-stored keys
... and other improvements.

It's been working pretty well for me. But I'd like to share it with
you guys.
Please drop me a note if somebody knows how to get these improvements
added to the next release of zotero.

spartan

May 5 2011, 12:52 am
Avram Lyon <ajl...@gmail.com>
Thu, 5 May 2011 08:52:03 +0400
Local: Thurs, May 5 2011 12:52 am
Subject: Re: Improving BibTeX export
Glad to hear you're so dedicated to making BibTeX support stronger.

On Thu, May 5, 2011 at 7:53 AM, spartan <spartan...@gmail.com> wrote:
> I love zotero so much. But got disturbed by the poor support of
> bibtex. So I spent quite some time to improve the functionality of the
> bibtex export. It should be very useful for those who use bibtex/latex
> a lot, in particular for scientists like me. But I don't know where I
> should post my modified codes.

Please post the code to http://gist.github.com/ and post a link to the
list, so the other BibTeX devs can take a look.

Avram

May 5 2011, 2:25 am
spartan <spartan...@gmail.com>
Wed, 4 May 2011 23:25:44 -0700 (PDT)
Local: Thurs, May 5 2011 2:25 am
Subject: Re: Improving BibTeX export
Here is the link:
Public Clone URL:       git://gist.github.com/956623.git

The readme file is attached below for your convenience:

===============
For the better of the research world that enjoys Zotero, Latex/BibTeX,
JabRef, Mendeley, SciPlore MindMapping, etc.

updated: May 4, 2011
************************
* spartan...@gmail.com *
************************

===============
List of files:

readme -- this file
bibtex_js_and_path.gz -- tarball of everything
BibTeXTan.js -- to replace the translator BibTeX.js
BibTeXTan.patch -- patch file relative to BibTeX.js of zotero v2.1.6
BibTeXKeyOnly.js -- new translator for quickcopy / drag&drop export of cite keys
BibTeXTanKeyOnly.patch -- patch file relative to BibTeXTan.js
item_local.js -- to replace content/zotero/xpcom/translation/item_local.js
item_local.patch -- patch file to that of zotero v2.1.6
translate.js -- to replace content/zotero/xpcom/translation/translate.js
translate.patch -- patch file to that of zotero v2.1.6

===============
New Features and Improvements on BibTeX Export:

1. very flexible bibtex key format definition using the fields of creators,
title, journal, year, volume, pages. including options of lower/upper cases,
initials, lengths, n-th author/word picking, etc. it is also easy to extend
to include new fields for key generation.

2. to resolve possible key collisions, users can specify their own suffix
either alphabetic or numeric.

3. a special field (e.g., callNumber or shortTitle) can be dedicated for export
of keys either pre-assigned or stored that are better done manually. Until
zotero provides a dedicated BibTeX key field or some type of local id field,
this is still a very appealing hack for most latex/bibtex users.

4. attachments like pdf files can be exported as file links only. The local
real path is stored in the file field. No need to export the large files
themselves. Keeping two copies on the machine is not good, in particular
for people who like to make comments on pdf files. This is also making zotero
cooperating better with other external applications like JabRef and SciPlore.
The real file export still works as expected if the option of exportFileData is
set to true. Only a small change in item_local.js is needed for such a
benefit.

5. unicode conversions for greek letters.

6. the strings of controlling the 1,2,3 features are easily modified through
browser prefs: extensions.zotero.bibtexKeyFormat,
extensions.zotero.bibtexKeyCollisionFormat,
extensions.zotero.bibtexKeyField.
No need to touch the translator itself for user's own definition. No formats
will be lost after upgrade. All of these only require a small modification in
translate.js. It also opens up access to prefs for all the other translators
which could be potentially very useful for others as well.

7. A Cite Key only stripped down translator is provided as well in a similar way
that Andrew Leifer did. This is very convenient for quickCopy or drag&drop in
latex editing.

8. some other minor improvements for bibtex export such as better treatment for
latex special characters like $,\,_,^, etc.

Some examples:
%au4%yr2%tt2
%AU_%yr_%TI
%Au4-%Au4{1}:%JJ%yr:Vol%vo:Pg%pg-%pg{1}

%Au4{1} = first 4 letters of 2nd creator's surname (first letter capitalized)
%ti6{2} = first 6 letters of 3rd word in title (all lowercase)
%AA4 = initials of first 4 creators' surnames (uppercase)
%tt6 = first letters of first 6 words in title (lowercase)
%JJ5 = first letters of first 5 words in journal name (uppercase)
%yr = %yr4 = four-digit year, %yr2 = two-digit year
%vo = volume number
%pg = first page number
%pg{1} = last page number
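
The format codes listed above can be illustrated with a small expander. This is a hypothetical sketch, not the code from the gist: the function names, the shape of `item` (`creators[].lastName`, `title`, `journal`, `year`, `volume`, `pages`) and the handling of anything not spelled out in the list (for example, a code without a length meaning "the whole value") are assumptions. Punctuation inside title words is not stripped.

```javascript
// Hypothetical sketch of expanding a cite-key format string such as
// "%au4%yr2%tt2". Not the implementation from the gist.

// The case of the two-letter code selects the case of the output:
// "au" -> lowercase, "Au" -> first letter capitalized, "AU" -> uppercase.
function applyCase(code, text) {
  if (code === code.toUpperCase()) {
    return text.toUpperCase();
  }
  if (code.charAt(0) === code.charAt(0).toUpperCase()) {
    return text.charAt(0).toUpperCase() + text.slice(1).toLowerCase();
  }
  return text.toLowerCase();
}

function splitWords(text) {
  return String(text || "").split(/\s+/).filter(Boolean);
}

// First letters of the first `count` entries of `list`.
function initials(list, count) {
  return list.slice(0, count).map(function (w) { return w.charAt(0); }).join("");
}

function expandKeyFormat(format, item) {
  var surnames = (item.creators || []).map(function (c) { return c.lastName; });
  var pages = String(item.pages || "").split(/[-\u2013]+/);
  // %<code><length>{<index>} with length and index both optional.
  return format.replace(/%([A-Za-z]{2})(\d*)(?:\{(\d+)\})?/g,
    function (match, code, len, idx) {
      var n = len ? parseInt(len, 10) : Infinity; // assumed: no length = everything
      var i = idx ? parseInt(idx, 10) : 0;        // {1} = second creator/word/page
      var value;
      switch (code.toLowerCase()) {
        case "au": value = (surnames[i] || "").slice(0, n); break;
        case "aa": value = initials(surnames, n); break;
        case "ti": value = (splitWords(item.title)[i] || "").slice(0, n); break;
        case "tt": value = initials(splitWords(item.title), n); break;
        case "jj": value = initials(splitWords(item.journal), n); break;
        case "yr":
          value = String(item.year || "");
          if (n === 2) { value = value.slice(-2); }
          break;
        case "vo": value = String(item.volume || ""); break;
        case "pg": value = pages[i] || ""; break;
        default: return match; // unknown codes are left untouched
      }
      return applyCase(code, value);
    });
}

var sampleItem = {
  creators: [{ lastName: "Einstein" }, { lastName: "Podolsky" }, { lastName: "Rosen" }],
  title: "Can Quantum-Mechanical Description of Physical Reality Be Considered Complete?",
  journal: "Physical Review",
  year: 1935,
  volume: 47,
  pages: "777-780"
};

console.log(expandKeyFormat("%au4%yr2%tt2", sampleItem)); // eins35cq
console.log(expandKeyFormat("%Au4-%Au4{1}:%JJ%yr:Vol%vo:Pg%pg-%pg{1}", sampleItem));
// Eins-Podo:PR1935:Vol47:Pg777-780
```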

===============
Collision Format:
%a for alphabetic suffix, %n for numeric suffix

var citeKeyCollisionFormat = Zotero.getPrefs("bibtexKeyCollisionFormat") ?
Zotero.getPrefs("bibtexKeyCollisionFormat") : "-%n";
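
How a collision format such as "-%n" or "%a" might be applied can be sketched as follows. The function names are hypothetical, and two details are assumptions rather than facts from the readme: that the first key keeps no suffix and that counting starts at 1 (or "a").

```javascript
// Hypothetical sketch of applying a collision format to a cite key.

// 1 -> "a", 2 -> "b", ..., 26 -> "z", 27 -> "aa" (assumed scheme).
function alphaSuffix(n) {
  var s = "";
  while (n > 0) {
    n -= 1;
    s = String.fromCharCode(97 + (n % 26)) + s;
    n = Math.floor(n / 26);
  }
  return s;
}

// Returns baseKey if it is free, otherwise baseKey plus the first free
// suffix built from collisionFormat. usedKeys is a Set that is updated.
function uniqueKey(baseKey, usedKeys, collisionFormat) {
  var format = collisionFormat || "-%n"; // same default as in the readme
  var key = baseKey;
  var counter = 0;
  while (usedKeys.has(key)) {
    counter += 1;
    key = baseKey + format
      .replace("%n", String(counter))
      .replace("%a", alphaSuffix(counter));
  }
  usedKeys.add(key);
  return key;
}

var used = new Set();
console.log(uniqueKey("eins35cq", used));       // eins35cq
console.log(uniqueKey("eins35cq", used));       // eins35cq-1
console.log(uniqueKey("eins35cq", used, "%a")); // eins35cqa
```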

===============
Specified Key Field: e.g. callNumber for using the stored key

var citeKeyField = Zotero.getPrefs("bibtexKeyField"); //e.g. callNumber

May 5 2011, 2:48 am
Avram Lyon <ajl...@gmail.com>
Thu, 5 May 2011 10:48:16 +0400
Subject: Re: Improving BibTeX export

I look forward to the comments of our BibTeX contributors, but I wanted to point out the two changes to Zotero core code that you're proposing:

1. Zotero.getPrefs(..), a new function in the sandbox that allows access to arbitrary preferences from the Zotero section of the user's prefs (patch to translate.js)

2. Expose the local path of attachments through the "path" attribute on the attachment object (patch to item_local.js)

I'm a little concerned about the first one, since I don't see a justification for exposing all the user's preferences to all translators. This also has some implications for the future portability of the translator to a server-side or in-connector translator sandbox. Couldn't we just keep BibTeX key editing internal to the file and have people edit a constant there?

The second proposed change seems quite reasonable.

- Avram

> This is also making
> zotero
> cooperating better with other external applications like JabRef and
> SciPlore.
> The real file export still works as expected if the option of
> exportFileData is
> set to true. Only a small change in item_local.js is needed for such a
> benefit.

File exports should already work with JabRef.  Haven't tested Mendeley
or SciPlore.  Have you?  What specific issues were you running into?

This implementation differs from the file export of other
translators.  Current implementation also allows you to share both the
bibtex file and attachments with other bibtex users who use other
machines.  I think exportFileData should be obeyed.  I don't know if I
disagree with the option of exporting a file that has links to
zotero's copy of files, but I don't think this implementation is what
we want right now.

> 5. unicode conversions for greek letters.

These are in mathmode, which is probably not ideal.  You also only
have one-way conversions right now.  A lot of users of greek use, at
minimum, babel.  Many use inputenc.  Export would work as-is for
that.  (And most greek-specific or international-specific tools would
work even better.)  Don't know the right answer for this: I'm not sure
of any other program that transliterates greek into tex entities.  Do
you have examples?

> 8. some other minor improvements for bibtex export such as better
> treatment for
> latex special characters like $,\,_,^, etc.

Again: can you explain the issues you're trying to address?
Preferably with an example (containing input text, output of the two
BibTeX translators, and how LaTeX/BibTeX would mis-render them)?

Thanks again for the patch!

--Rick

May 5 2011, 11:46 am
Richard Karnesky <karne...@gmail.com>
Thu, 5 May 2011 08:46:26 -0700 (PDT)
Local: Thurs, May 5 2011 12:08 pm
Subject: Re: Improving BibTeX export
I think the bibtex export also needs better patent support. I added
this:

if (item.patentNumber) {
    writeField("note", item.patentNumber);
    writeField("howpublished", "Patent");
}

As suggested here: http://see-out.com/sandramau/bibpat.html

What do you think?

Edouard.

On May 5, 5:46 pm, Richard  Karnesky <karne...@gmail.com> wrote:

May 5 2011, 12:08 pm
spartan <spartan...@gmail.com>
Thu, 5 May 2011 09:19:09 -0700 (PDT)
Local: Thurs, May 5 2011 12:19 pm
Subject: Re: Improving BibTeX export
Avram, thanks for your comments.

for point 1:
The reason for the new function Zotero.getPrefs() is to easily store
users' settings somewhere, so they don't lose their settings at the next
upgrade. Users really hate to keep their complicated settings somewhere
else, like a sticky note, and to manually edit the file (e.g. BibTeX.js
in this case) when the next upgrade of zotero happens. If it can be done
some other way, like getOption() and the GUI, it would be fantastic. But
I don't know how to achieve that.

If people are really concerned about exposing all zotero-related
preferences to all translators. We can define a more specific function
like,

"getBibtexPrefs":function(translate, pref) {
    ...
    return Zotero.Prefs.get("bibtex"+pref);
}

In this case, only bibtex related preferences get exposed. What do you
think?

-spartan

On May 5, 2:48 am, Avram Lyon <ajl...@gmail.com> wrote:

May 5 2011, 12:47 pm
Avram Lyon <ajl...@gmail.com>
Thu, 5 May 2011 20:47:32 +0400
Local: Thurs, May 5 2011 12:47 pm
Subject: Re: Improving BibTeX export
spartan--

On Thu, May 5, 2011 at 8:19 PM, spartan <spartan...@gmail.com> wrote:

[..]

> If people are really concerned about exposing all zotero-related
> preferences to all translators. We can define a more specific function
> like,

> "getBibtexPrefs":function(translate, pref) {
> ...
> return Zotero.Prefs.get("bibtex"+pref);
> }

> In this case, only bibtex related preferences get exposed. What do you
> think?

I agree with your intent here, and I hope we can find a workable
solution. If the key generation settings are going to be in the Zotero
preferences, I think we should probably handle it as a
non-BibTeX-specific subtree of the preferences:

"getPref":function(translate, pref) {
    ...
    return Zotero.Prefs.get("translator."+pref);
}

We can imagine more uses for such preferences for other translators as
well, although I think we'd have to be very careful about what uses we
allow in translators that ship with Zotero. The added freedom might
prove useful for internal-use translators being created for specific
users' or institutions' workflows.
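
The idea of exposing only one subtree of the preferences to translators can be sketched in isolation. Everything here is hypothetical: `prefStore` stands in for the real preference backend, the `extensions.zotero.translator.` prefix and the second preference name are made up for illustration, and the name check is just one possible way to keep callers inside the subtree.

```javascript
// Hypothetical sketch: a preference reader that can only see keys below
// one fixed prefix. prefStore is a stand-in, not Zotero's preference API.
var prefStore = {
  "extensions.zotero.translator.bibtexKeyFormat": "%au4%yr2%tt2",
  "extensions.zotero.somePrivateSetting": "not for translators"
};

function makeTranslatorPrefReader(store) {
  var prefix = "extensions.zotero.translator.";
  return function getPref(name, fallback) {
    // Only plain names are accepted, so a caller cannot build a key
    // that points outside the translator subtree.
    if (!/^[A-Za-z0-9_]+$/.test(name)) {
      return fallback;
    }
    var key = prefix + name;
    return Object.prototype.hasOwnProperty.call(store, key) ? store[key] : fallback;
  };
}

var getPref = makeTranslatorPrefReader(prefStore);
console.log(getPref("bibtexKeyFormat", "%au%yr"));       // %au4%yr2%tt2
console.log(getPref("bibtexKeyCollisionFormat", "-%n")); // -%n
console.log(getPref("somePrivateSetting", "hidden"));    // hidden
```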

I flagged this not because I disagree with it, but rather because
changes to the core client go through a different review process, and
I wanted to make sure that the core devs took a look at these specific
proposed changes and weighed in on them.

Avram

May 5 2011, 12:51 pm
spartan <spartan...@gmail.com>
Thu, 5 May 2011 09:51:16 -0700 (PDT)
Local: Thurs, May 5 2011 12:51 pm
Subject: Re: Improving BibTeX export

I am not talking about the actual file exports. That is still working
as expected. What I did was to have the native path of the file saved in
the 'file' field of bibtex when people set exportFileData = false. In
this way, when you open the generated bibtex file in some external
application, you will have access to the attachments that still reside
in the zotero dataDir. It is not a good idea in general to keep two
copies of attachment data, especially when people make a lot of
comments on those pdf files. It will be hard to keep the copies synced
all the time. The real file copying function is not lost when you
set exportFileData = true. It is the same as before, so you can still
share your data in the same way with friends, etc.

> > 5. unicode conversions for greek letters.

> These are in mathmode, which is probably not ideal.  You also only
> have one-way conversions right now.  A lot of users of greek use, at
> minimum, babel.  Many use inputenc.  Export would work as-is for
> that.  (And most greek-specific or international-specific tools would
> work even better.)  Don't know the right answer for this: I'm not sure
> of any other program that transliterates greek into tex entities.  Do
> you have examples?

Scientific publications contain greek letters everywhere. Most of
the corresponding websites use unicode. The unicode conversion tool
is already in the translator. What I did was to add lines for the
very common greek letters used widely in science. For example,
the greek alpha -> $\alpha$, the capital greek GAMMA -> $\Gamma$.
The resulting bibtex file will be more compatible with latex.

> > 8. some other minor improvements for bibtex export such as better
> > treatment for
> > latex special characters like $,\,_,^, etc.

> Again: can you explain the issues you're trying to address?
> Preferably with an example (containing input text, output of the two
> BibTeX translators, and how LaTeX/BibTeX would mis-render them)?

It is only one line change in BibTeX.js. Normally we don't want to
escape some very common latex special characters like $ and \. When these
letters are in the fields like title, abstract, you want to keep it as
it is.
for example, it is very common $\alpha$ appears in title. The current
BibTeX.js will convert it to \$\\alpha\$ which is unwanted. These kinds
of things are better kept untouched. The one situation that we do need
to escape a $ sign is, for cases like $1,000.00 which is considered in
the new code.
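
One way to read the behaviour described above is "leave `$` alone unless it looks like a currency sign". The sketch below is an assumption about how such a rule could look, not the one-line change from the patch: it escapes a `$` only when a digit follows it and it is not already escaped. Math such as `$2^n$` would be escaped by mistake, which shows how fragile this kind of heuristic is.

```javascript
// Hypothetical heuristic, not the code from the patch: escape "$" only
// when it is directly followed by a digit and not preceded by a backslash.
function escapeCurrencyDollars(text) {
  return text.replace(/(^|[^\\])\$(?=\d)/g, function (match, before) {
    return before + "\\$";
  });
}

console.log(escapeCurrencyDollars(String.raw`Costs $1,000.00 for $\alpha$ rays`));
// Costs \$1,000.00 for $\alpha$ rays
```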

May 5 2011, 1:22 pm
spartan <spartan...@gmail.com>
Thu, 5 May 2011 10:22:16 -0700 (PDT)
Local: Thurs, May 5 2011 1:22 pm
Subject: Re: Improving BibTeX export

Great! That is a better solution.

> We can imagine more uses for such preferences for other translators as
> well, although I think we'd have to be very careful about what uses we
> allow in translators that ship with Zotero. The added freedom might
> prove useful for internal-use translators being created for specific
> users' or institutions' workflows.

> I flagged this not because I disagree with it, but rather because
> changes to the core client go through a different review process, and
> I wanted to make sure that the core devs took a look at these specific
> proposed changes and weighed in on them.

I hope the core devs will adopt something similar or find some other
good way to store user settings.

A little bit off the topic, but...The lack of strong and better
support of bibtex/latex is the only thing that prevents a lot of
scientific researchers like me to fully embrace zotero. I hope that
the situation will get better and better. I guess that the latex users
are a pretty large group of zotero users. I see zotero posts related
to bibtex or latex everywhere. Unfortunately most of these guys
are so busy with their research projects. Not many of them can
devote their time to improve the bibtex support in zotero. I am not
a programmer but a normal busy researcher. I got so frustrated in
the situation. So I dived in and learned javascript myself and
started modifying the codes. I hope more researchers will join
in and make the bibtex support the best part of zotero.

-spartan

May 5 2011, 1:44 pm
Richard Karnesky <karne...@gmail.com>
Thu, 5 May 2011 10:44:08 -0700 (PDT)
Local: Thurs, May 5 2011 1:44 pm
Subject: Re: Improving BibTeX export

> I think the bibtex export needs also better patent support.

I think we should probably just use a @patent type.  This is not one
of the item types that were originally documented in 'BibTeXing', but it
is tremendously popular.  JabRef & BibLaTeX use it, as do a ton
of .bst styles.

--Rick
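
The two patent suggestions in this thread can be put side by side in a small sketch. The function name and the `item` shape are hypothetical, and using a `number` field for the `@patent` variant is an assumption based on common practice, not something stated in the thread.

```javascript
// Hypothetical sketch comparing the two proposals for patents:
// (a) a generic @misc entry with note/howpublished, (b) a @patent entry.
function patentEntry(citeKey, item, usePatentType) {
  var fields = [["title", item.title]];
  if (usePatentType) {
    fields.push(["number", item.patentNumber]); // assumed field name
  } else {
    fields.push(["note", item.patentNumber], ["howpublished", "Patent"]);
  }
  var body = fields
    .filter(function (f) { return f[1]; }) // skip empty values
    .map(function (f) { return "\t" + f[0] + " = {" + f[1] + "}"; })
    .join(",\n");
  return "@" + (usePatentType ? "patent" : "misc") + "{" + citeKey + ",\n" + body + "\n}";
}

var patent = { title: "Widget", patentNumber: "US 1234567" };
console.log(patentEntry("doe2010", patent, false));
console.log(patentEntry("doe2010", patent, true));
```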

May 5 2011, 2:47 pm
Richard Karnesky <karne...@gmail.com>
Thu, 5 May 2011 11:47:11 -0700 (PDT)
Local: Thurs, May 5 2011 2:47 pm
Subject: Re: Improving BibTeX export

> > > 4. attachments like pdf files can be exported as file links only.

As I see it, there are three possibilities:
(1) don't export information about attachments
(2) copy attachments and export information about them
(3) export information about the local copies.

Current behavior is (1) or (2).  Your proposed behavior is (2) or
(3).  I'm not sure if this result is intuitive based on the "Export
Files" option.  Perhaps, at minimum, this should be rephrased?

File export is not unique to BibTeX & behavior (3) does not happen in
other translators that can export files.  Should it?  Do people value
having option (1)?  It has the advantage of privacy & also being able
to import the file onto another system without the brittle local
paths.  Should we therefore allow any of these three choices?

Or perhaps BibTeX is special: it is more likely to be used by a
researcher on his own machine, as it is not only a data exchange
format, but a format that is used in document preparation.

As I said: this raises issues that should be discussed....

> > > 5. unicode conversions for greek letters.

Let me note that the Unicode<->LaTeX entity transliteration originated
in 'refbase', which I co-lead (though Matthias Steffens was the author
of that code).  I've patched that section of code within Zotero.
After double-checking, I am only slightly embarrassed to see that I
forgot that 'refbase' also exports greek characters in math mode.  If
it is good enough for that, it is good enough for this :-).  (But I'll
also note that Elsevier, Springer, AIP, and many others do NOT do
this).

HOWEVER:  you only added entries to the mapping table & not to the
reverse table.  This means import/export is not reversible.  We should
add entries to the other table too.
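
One way to keep import and export reversible is to derive the reverse table from the forward one instead of maintaining two lists by hand. The sketch below is illustrative only: the table holds three sample entries, the names are hypothetical, and the translator's real mapping tables are organized differently.

```javascript
// Hypothetical sketch: build the import table from the export table so
// the two cannot drift apart. Only three sample entries are included.
var unicodeToLatex = {
  "\u03B1": "$\\alpha$", // greek small alpha
  "\u03B2": "$\\beta$",  // greek small beta
  "\u0393": "$\\Gamma$"  // greek capital gamma
};

var latexToUnicode = {};
Object.keys(unicodeToLatex).forEach(function (ch) {
  latexToUnicode[unicodeToLatex[ch]] = ch;
});

// Export: replace mapped Greek characters with math-mode entities.
function exportText(text) {
  return text.replace(/[\u0370-\u03FF]/g, function (ch) {
    return unicodeToLatex[ch] || ch;
  });
}

// Import: replace known math-mode entities with their Unicode characters.
function importText(text) {
  return text.replace(/\$\\[A-Za-z]+\$/g, function (entity) {
    return latexToUnicode[entity] || entity;
  });
}

var title = "\u03B1 decay and the \u0393 function";
console.log(exportText(title));                       // $\alpha$ decay and the $\Gamma$ function
console.log(importText(exportText(title)) === title); // true
```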

> > > 8. some other minor improvements for bibtex export such as better
> > > treatment for
> > > latex special characters like $,\,_,^, etc.

I am opposed to this change.  I do realize that some other BibTeX
users really want it.  I've helped them implement something similar in
their personal copies of the translator.  But it runs with a very
different philosophy than Zotero is made with.

For the clarity of others:  the proposed change affects export by
allowing BibTeX users to use BibTeX-specific markup that will be
preserved on export.  Encouraging this markup would break sharing of
the references with anything that is not BibTeX, as people would write
LaTeX entities into zotero fields, rather than using the
what-you-say-is-what-you-get-everywhere UTF-8 equivalents.

Yes, if you write 'α' in Zotero, you should get '$\alpha$' in BibTeX.
Patching the mapping tables would get you that.  However, people
should not write '$\alpha$' directly in a Zotero field!

While accepting this change would make Zotero more expressive to
LaTeX/BibTeX users, it wouldn't help people who use e.g. the word
processor plugins or html output or other ways of using references.
It'd make references less shareable.

The right way to fix this is to address:
 https://www.zotero.org/trac/ticket/439

See:
 http://forums.zotero.org/discussion/5324/bibtex-and-greek-characters/
and many others for philosophical discussion.

--Rick

May 5 2011, 3:04 pm
Bruce D'Arcus <bdar...@gmail.com>
Thu, 5 May 2011 15:04:58 -0400
Local: Thurs, May 5 2011 3:04 pm
Subject: Re: Improving BibTeX export

On Thu, May 5, 2011 at 2:47 PM, Richard Karnesky <karne...@gmail.com> wrote:
>> > > 8. some other minor improvements for bibtex export such as better
>> > > treatment for
>> > > latex special characters like $,\,_,^, etc.

> I am opposed to this change.  I do realize that some other BibTeX
> users really want it.  I've helped them implement something similar in
> their personal copies of the translator.  But it runs with a very
> different philosophy than Zotero is made with.

> For the clarity of others:  the proposed change affects export by
> allowing BibTeX users to use BibTeX-specific markup that will be
> preserved on export.  Encouraging this markup would break sharing of
> the references with anything that is not BibTeX, as people would write
> LaTeX entities into zotero fields, rather than using the what-you-say-
> is-what-you-get-everywhere UTF-8 equivalents.

Yes, definitely agree with Rick's position here, and strongly against
in any way treating bibtex as special for purposes of data entry (use
UTF8) or key/label generation (which is relevant for a variety of
formats).

Bruce

May 5 2011, 6:27 pm
spartan <spartan...@gmail.com>
Thu, 5 May 2011 15:27:28 -0700 (PDT)
Local: Thurs, May 5 2011 6:27 pm
Subject: Re: Improving BibTeX export

On May 5, 2:47 pm, Richard  Karnesky <karne...@gmail.com> wrote:

Yes, you are right. My implementation omits option (1). Maybe we
can put it in a user setting like bibtexAttachmentsLocalPath in the
preferences. Maybe even set it to false as default. Then users will
have control over whether to export the local path information or not.
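
The three attachment behaviours Rick lists, combined with the setting suggested here, can be sketched as a small decision function. This is a hypothetical illustration: the function names and the `attachment` fields (`localPath`, `exportedPath`) are made up, and `localPathPref` stands for the proposed bibtexAttachmentsLocalPath preference.

```javascript
// Hypothetical sketch of choosing between the three behaviours:
// (1) no attachment info, (2) copy files, (3) link to local copies.
function attachmentMode(exportFileData, localPathPref) {
  if (exportFileData) {
    return "copy";                        // option (2)
  }
  return localPathPref ? "link" : "none"; // option (3) or option (1)
}

// Value for the BibTeX 'file' field, or null when nothing is written.
function fileFieldValue(attachment, mode) {
  switch (mode) {
    case "copy": return attachment.exportedPath; // path of the exported copy
    case "link": return attachment.localPath;    // path inside the data directory
    case "none": return null;
    default: throw new Error("unknown attachment mode: " + mode);
  }
}

var pdf = { localPath: "/home/user/zotero/storage/ABCD1234/paper.pdf",
            exportedPath: "files/12/paper.pdf" };
console.log(fileFieldValue(pdf, attachmentMode(false, false))); // null
console.log(fileFieldValue(pdf, attachmentMode(false, true)));  // /home/user/zotero/storage/ABCD1234/paper.pdf
console.log(fileFieldValue(pdf, attachmentMode(true, false)));  // files/12/paper.pdf
```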

> > > > 5. unicode conversions for greek letters.

> Let me note that the Unicode<->LaTeX entity transliteration originated
> in 'refbase', which I co-lead (though Matthias Steffens was the author
> of that code).  I've patched that section of code within Zotero.
> After double-checking, I am only slightly embarrassed to see that I
> forgot that 'refbase' also exports greek characters in math mode.  If
> it is good enough for that, it is good enough for this :-).  (But I'll
> also note that Elsevier, Springer, AIP, and many others do NOT do
> this).

> HOWEVER:  you only added entries to the mapping table & not to the
> reverse table.  This means import/export is not reversible.  We should
> add entries to the other table too.

I did not put them in the reverse table. But that can be easily done.

Somehow I am feeling a little too much hatred towards bibtex/latex
in zotero community. Please forgive me if I did not use more polite
words here or there because I am not a native english speaker.

Physical sciences all have long history for research and literature.
Not all of the documents are in modern or fancy formats. But so
much of the information is good enough and found its way getting
into my reference database and I am sure, also, a lot of other
scientists'. Are we asking all the scientific researchers who are
using zotero and bibtex to manually modify their databases to
conform "zotero's standard"? Those latex-specific
markups are scattered everywhere in these people's databases.
It would be a heck of a job for individual users to convert those to
other markups appealing to office word users. And the info was
not created by the actual users but their ancestors. However,
zotero wants to punish these users because they include such
information in their databases. That is ironic even for the sake
of demoting the use of latex markups in zotero.

On the other hand, users of the bibtex translator probably use
latex editors more than office word. They do not need their
databases to look appealing to other guys. When they do
share their information, they share it most likely with people
of similar minds. Is there any reason that a office word user
would use the bibtex exporter? I guess not. So why don't
we just let the bibtex exporter do what is best for most of
the latex/bibtex users?

Somehow the spirit around zotero is not right. It should be
more for a cooperative, compatible, user-friendly one. It will
be really disappointing if it becomes a policing tool to
tell people what to use and what not.

Please forgive me if my words offend some people.

-spartan

> The right way to fix this is to address:
>  https://www.zotero.org/trac/ticket/439

> See:
>  http://forums.zotero.org/discussion/5324/bibtex-and-greek-characters/
> and many others for philosophical discussion.

thanks for the info. I'll take a look.

May 5 2011, 6:42 pm
Bruce D'Arcus <bdar...@gmail.com>
Thu, 5 May 2011 18:42:17 -0400
Local: Thurs, May 5 2011 6:42 pm
Subject: Re: Improving BibTeX export

On Thu, May 5, 2011 at 6:27 PM, spartan <spartan...@gmail.com> wrote:

...

> Somehow I am feeling a little too much hatred towards bibtex/latex
> in zotero community. Please forgive me if I did not use more polite
> words here or there because I am not a native english speaker.

I have no "hatred" towards that community.

I do get annoyed when people present it as the only significant game
in town though, and don't consider the wider world that Zotero is
attempting to be compatible with (a variety of other formats, the web,
word-processors, etc.).

> Physical sciences all have long history for research and literature.
> Not all of the documents are in modern or fancy formats. But so
> much of the information is good enough and found its way getting
> into my reference database and I am sure, also, a lot of other
> scientists'. Are we asking all the scientific researchers who are
> using zotero and bibtex to manually modify their databases to
> conform "zotero's standard"? Those latex-specific
> markups are scattered everywhere in these people's databases.

So use a BibTeX-dedicated solution?

I have no problem if, for example, Zotero were to add math support in
ways that would allow round-tripping to LaTeX. But that's a different
matter.

Rick has more direct experience working with BibTeX, though, as well
as understanding of Zotero, so I'd tend to defer to him on the
details.

[...]

> Somehow the spirit around zotero is not right. It should be
> more for a cooperative, compatible, user-friendly one. It will
> be really disappointing if it becomes a policing tool to
> tell people what to use and what not.

It really depends on where you stand: "cooperative, compatible,
user-friendly" for whom, after all? Just BibTeX users?

Bruce

May 5 2011, 6:51 pm
adamsmith <bst...@gmx.de>
Thu, 5 May 2011 15:51:44 -0700 (PDT)
Local: Thurs, May 5 2011 6:51 pm
Subject: Re: Improving BibTeX export
spartan,
there is certainly no hostility towards bibtex/latex: Rick, for
example, is a physicist and, I believe, authors mainly in LaTeX. I
can't weigh in on the details, but on a broader note, there is a very
strong commitment in the open source community to standards: Standards
facilitate data sharing, they facilitate interoperability, they
facilitate upgrades, they help in dealing with unforeseen new use cases
etc. There is a good reason that people strongly defend them - that
doesn't mean they don't want LaTeX-suitable solutions, they just don't
want them as idiosyncratic (quasi-) hacks.

On May 5, 6:27 pm, spartan <spartan...@gmail.com> wrote:

May 5 2011, 11:04 pm
spartan <spartan...@gmail.com>
Thu, 5 May 2011 20:04:06 -0700 (PDT)
Local: Thurs, May 5 2011 11:04 pm
Subject: Re: Improving BibTeX export
adam,

I am sorry if I used some harsh words. But frankly I don't see how the
current practice in zotero will help promote standards. Say, in a
reference item, $\alpha$ got embedded in the title. In the current
BibTeX.js, all the special characters get intentionally escaped. So in
the exported bibtex file, the title will include garbage like \$\\alpha\$
which can not be correctly treated in any bibtex/latex editor. It is
not useful for any word application either. If this piece of ill-formatted
information got shared around, for example, imported back to another
zotero database, it will not get corrected. If it got exported again, it
will become \\\$\\\\alpha\\\$. It could grow into a monster garbage if
such things go into cycle and cycle again. No sane importer/exporter
later can really correct this problem easily. I don't see how such a
piece of junk is useful for any other users (e.g., who use office word).
So I view this practice as a pure punishment to whoever uses this
piece of reference and the guys we shared it with.
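
The growth described here can be reproduced with a toy escaper. This is a simplified stand-in that only handles `\` and `$`, not the translator's real escaping code. It shows the text growing on each export when the import side does not undo the escaping, and that a matching unescape step restores the original.

```javascript
// Simplified stand-in for export escaping: prefix "\" and "$" with "\".
function escapeLatex(text) {
  return text.replace(/[\\$]/g, function (ch) { return "\\" + ch; });
}

// Matching import step: remove one level of escaping.
function unescapeLatex(text) {
  return text.replace(/\\([\\$])/g, "$1");
}

var original = String.raw`$\alpha$`;
var once = escapeLatex(original); // export
var twice = escapeLatex(once);    // import without unescaping, then export again

console.log(once);                             // \$\\alpha\$
console.log(twice);                            // \\\$\\\\alpha\\\$
console.log(unescapeLatex(once) === original); // true
```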

On the contrary, things will be dramatically different if we do the
export in the correct way. Let's use the above example again. The
markup piece $\alpha$ should go untouched into the resulting
bibtex file for a sensible exporter. Any bibtex/latex applications
will deal with the file nicely. If later on the item got imported back
into the same or another database through user-reorganizing or
sharing with colleagues, the good things happen: the bibtex-
standard markup piece $\alpha$ will be converted into a correct
unicode character of the greek alpha. Suddenly everything
starts to follow zotero standard. No matter how many times it
got imported/exported either via bibtex or any other format, the
information will be correctly interpreted. Not only are the latex
users happy, but also are other users who have to import the
same item. See, gradually we can convert old non-standard
markups circulating in many databases into standard-obeying
new format. You can not achieve this using the former
punishing practice.

-spartan

On May 5, 6:51 pm, adamsmith <bst...@gmx.de> wrote:

May 5 2011, 11:21 pm
spartan <spartan...@gmail.com>
Thu, 5 May 2011 20:21:15 -0700 (PDT)
Local: Thurs, May 5 2011 11:21 pm
Subject: Re: Improving BibTeX export

> I do get annoyed when people present it as the only significant game
> in town though, and don't consider the wider world that Zotero is
> attempting to be compatible with (a variety of other formats, the web,
> word-processors, etc.).

I have no objections to the support for other formats. As I mentioned
in the last post, the current practice does not make the existing
databases more compatible with other formats. So why punish
the latex users without real benefits for other formats?

> > Physical sciences all have long history for research and literature.
> > Not all of the documents are in modern or fancy formats. But so
> > much of the information is good enough and found its way getting
> > into my reference database and I am sure, also, a lot of other
> > scientists'. Are we asking all the scientific researchers who are
> > using zotero and bibtex to manually modify their databases to
> > conform "zotero's standard"? Those latex-specific
> > markups are scattered everywhere in these people's databases.

> So use a BibTeX-dedicated solution?

If there were such an open source solution as successful as zotero
in collecting information, I seriously would go for it. I don't know
of any. That is why I love zotero but got frustrated with its bibtex
support. I just learned javascript myself and spent a lot of my spare
time implementing all these changes.

This really puzzles me. We are talking about modifying
BibTeX.js, aren't we? Why can't we make it follow bibtex
rules? It is not as if I am asking for changes in other format
translators to favor bibtex markups.

May 6 2011, 12:57 am
Avram Lyon <ajl...@gmail.com>
Fri, 6 May 2011 08:57:49 +0400
Local: Fri, May 6 2011 12:57 am
Subject: Re: Improving BibTeX export

On Fri, May 6, 2011 at 7:21 AM, spartan <spartan...@gmail.com> wrote:
> I have no objections to the support for other formats. As I mentioned
> in
> the last post, the current practice does not make the existing
> databases more compatible with other formats. So why punish
> the latex users without real benefits for other formats?
[..]
>> It really depends on where you stand: "cooperative, compatible,
>> user-friendly" for whom, after all? Just BibTeX users?

> This really puzzles me. We are talking about modifying
> BibTeX.js, aren't we? Why can't we make it follow bibtex
> rules. It is not that I am asking for the changes in other format
> translators to favor bibtex markups.

The concern is that decisions like handling of LaTeX's math-mode
delimiters or character entities have pretty major consequences for
data exchange. Upon import, $\alpha$ should definitely be converted to
the Greek letter in its Unicode representation. Similarly,
superscripted and subscripted characters and many other symbols have
Unicode representations, and we can and should make the proper
conversions to convert to them when importing BibTeX files. This is
particularly important because BibTeX is a common data interchange
format-- we have site translators like the Google Scholar translator
that rely on BibTeX, and many people import large quantities of items
using the format. These may be people who never actually use
BibTeX/LaTeX for bibliographies and documents, or they could be people
who intend to export back to BibTeX at a later date.
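
A minimal sketch of the import-side conversion described above, assuming
a hypothetical lookup table (GREEK) and helper name (texSymbolsToUnicode)
that are not taken from the Zotero translator. It only handles math-mode
spans holding a single known symbol command and leaves everything else
untouched:

```javascript
// Hypothetical lookup table; a real importer would need a far larger one.
const GREEK = {
  alpha: "\u03B1",
  beta: "\u03B2",
  gamma: "\u03B3",
  mu: "\u03BC",
  pi: "\u03C0",
};

// Replace math-mode spans that contain exactly one known symbol command,
// e.g. "$\alpha$", with the matching Unicode character.
// Unknown commands and longer expressions are returned unchanged.
function texSymbolsToUnicode(text) {
  return text.replace(/\$\\([A-Za-z]+)\$/g, (match, name) =>
    Object.prototype.hasOwnProperty.call(GREEK, name) ? GREEK[name] : match
  );
}
```

For example, texSymbolsToUnicode("Decay of $\\alpha$ particles") gives the
title with a Unicode alpha, while "$\\frac{a}{b}$" passes through as is.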

Most of the (La)TeX markup that occurs in bibliographic data is
limited to individual symbols -- these can be stored as Unicode within
Zotero, and imported and exported correctly. Full-fledged math support
is only rarely needed, particularly outside of abstracts. To enter the
individual symbols from within Zotero, the current system requires
that users enter them as Unicode-- probably by using the system
character palette. While this is probably not terribly convenient,
it's nothing that couldn't be streamlined by a small plugin to convert
(La)TeX names into the correct Unicode symbols.

If we set up a situation where the recommended workflow is to mark up
bibliographic data in (La)TeX ways, users who take this approach will
lose the full flexibility of Zotero and may no longer be able to use
their data in any non-BibTeX context. No export to RIS, (possibly) no
OpenURL locator lookups, no use of the server API, particularly its
formatted citations, no use of the local integration plugins, from
word processor plugins to emacs org-mode integration.

All that said, I do think we could do better in our handling of
math-mode upon export to BibTeX. The arguments above do not imply that
we should actively refuse to handle what the user clearly intends to
do. If we can devise a handling for $...$ that respects user
intentions on export, and doesn't escape the lot of it, but still
doesn't risk biting users who just happen to use dollar signs a lot,
then I think we should put it into the BibTeX export function.
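
One possible shape for such an export rule, as a sketch only. The
heuristic is an assumption made for illustration, not existing Zotero
behavior: a $...$ span is passed through as math only when it contains a
backslash command, and every other dollar sign is escaped. The function
names are hypothetical, and a title such as "$5 for \alpha and $10"
would still fool it:

```javascript
// Escape characters that are special to TeX outside math mode.
// Backslashes and braces are deliberately not handled in this sketch.
function escapeTeXText(s) {
  return s.replace(/[$%&#_]/g, (ch) => "\\" + ch);
}

// Assumed heuristic: "$...$" counts as math only if the span holds a
// backslash command such as \alpha or \frac. Math spans are copied
// verbatim; all text between them is escaped.
function exportField(value) {
  const mathSpan = /\$([^$]*\\[A-Za-z]+[^$]*)\$/g;
  let out = "";
  let last = 0;
  for (const m of value.matchAll(mathSpan)) {
    out += escapeTeXText(value.slice(last, m.index)) + m[0];
    last = m.index + m[0].length;
  }
  return out + escapeTeXText(value.slice(last));
}
```

With this rule, "Costs $5 and $10" has both dollar signs escaped, while
"$\frac{1}{2}$" is exported untouched.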

Regards,

Avram

May 6 2011, 1:58 am
Dan Stillman <dstill...@zotero.org>
Fri, 06 May 2011 01:58:18 -0400
Local: Fri, May 6 2011 1:58 am
Subject: Re: Improving BibTeX export
On 5/5/11 11:04 PM, spartan wrote:

> Say, in a reference item, $\alpha$ got embedded in the title.

The rest of your argument depends on this assumption, but $\alpha$ is
"garbage" to all non-LaTeX-using members of the Zotero ecosystem. It
shouldn't be there to begin with, and it's a bug if it is. Import
translators should be fixed to ensure that it doesn't end up in a Zotero
field. Batch editing should allow people to fix that data.

Our philosophy on this is pretty well documented by this point, in this
thread and elsewhere. You may feel that BibTeX-using Zotero users "do
not need their databases to look appealing to other guys", but that's
contrary to Zotero's goals. This isn't a "punitive" policy, and it's in
no way specific to BibTeX. Adding support to export translators for
format-specific markup in Zotero fields encourages the entry of bad data
in Zotero fields.

(And, really, it's a little silly to keep suggesting that properly
rendered characters are something that only MS Word users are interested
in.)

May 6 2011, 3:29 am
spartan <spartan...@gmail.com>
Fri, 6 May 2011 00:29:06 -0700 (PDT)
Local: Fri, May 6 2011 3:29 am
Subject: Re: Improving BibTeX export
Avram,

Thanks for your elaboration. But I feel that you may get my point
better if you read my earlier reply to adam smith.

Your and the other devs' concern is that supporting proper export
of bibtex markup will promote the more convenient non-standard use
of bibtex markup. My point is that the opposite is more likely.

At least for me, I use zotero because it is a really powerful tool
for gathering reference data automatically from other places,
especially the web. I seldom manually input anything in the
standard fields like title or abstract. I doubt people do much
manual input in these fields. People would go to another reference
manager if they enjoyed this kind of manual input. The only field
where I do manual input, and I imagine others do as well, is the
note, which should not affect the discussion here. The bibtex
markups that appear in many people's zotero databases are
most probably NOT coming from these individual end users. They
came from various web resources, either current or from old or
ancient times, which most likely have nothing to do with zotero.

So why punish the innocent end users for 'crimes' others commit?

If we do as I proposed, an export and then an import will
automatically convert non-standard databases back to conform
to the unicode standard. The current practice will keep the
non-standard data forever unless the end users suddenly
become diligent enough to correct it all manually.

-spartan

> The concern is that decisions like handling of LaTeX's math-mode
> delimiters or character entities have pretty major consequences for
> data exchange. Upon import, $\alpha$ should definitely be converted to
> the Greek letter in its Unicode representation. Similarly,
> superscripted and subscripted characters and many other symbols have
> Unicode representations, and we can and should make the proper
> conversions to convert to them when importing BibTeX files. This is
> particularly important because BibTeX is a common data interchange
> format-- we have site translators like the Google Scholar translator
> that rely on BibTeX, and many people import large quantities of items
> using the format. These may be people who never actually use
> BibTeX/LaTeX for bibliographies and documents, or they could be people
> who intend to export back to BibTeX at a later date.

If $\alpha$ was incorrectly exported into the bib file, how can
import correct it to the proper greek letter? When these non-latex
users import the bib file, they'll get the junk in their databases
as well.

> Most of the (La)TeX markup that occurs in bibliographic data is
> limited to individual symbols -- these can be stored as Unicode within
> Zotero, and imported and exported correctly. Full-fledged math support
> is only rarely needed, particularly outside of abstracts. To enter the
> individual symbols from within Zotero, the current system requires
> that users enter them as Unicode-- probably by using the system
> character palette. While this is probably not terribly convenient,
> it's nothing that couldn't be streamlined by a small plugin to convert
> (La)TeX names into the correct Unicode symbols.

How likely is it that people will start to enjoy manually inputting
titles, etc.? To me that really defeats the purpose of using zotero.

> If we set up a situation where the recommended workflow is to mark up
> bibliographic data in (La)TeX ways, users who take this approach will
> lose the full flexibility of Zotero and may no longer be able to use
> their data in any non-BibTeX context. No export to RIS, (possibly) no
> OpenURL locator lookups, no use of the server API, particularly its
> formatted citations, no use of the local integration plugins, from
> word processor plugins to emacs org-mode integration.

My suggested approach will help convert non-standard data back to
standard data, ready for all the uses you mentioned. But currently
the non-standard data will stay there until it is manually corrected,
which can be a huge task.

As I said, zotero users are very unlikely to input the title,
abstract, etc. manually. We don't use it to write new papers, do we?
We use it to gather published information from various places
automatically. People would have abandoned zotero long ago if they
had to do such input. On the other hand, I don't see that zotero's
intentional demotion of bibtex will force all the websites to abandon
latex markups. Nor do I see that a proper export of latex markups
will make all the websites favor latex markups.

Personally, I think it is also good for other translators to preserve
their own markups (I have no knowledge of them) when they
export. It is like translating an English book or article that
contains a few Russian phrases into Russian: we keep the
existing Russian phrases as they are. It is hard to imagine
that we would trash those Russian phrases in translation because
they are non-standard in an English book. If the whole world starts
to speak Russian because of that, then so be it.
(I just use Russian as an example; I have great respect for Russian.)

May 6 2011, 3:49 am
spartan <spartan...@gmail.com>
Fri, 6 May 2011 00:49:51 -0700 (PDT)
Local: Fri, May 6 2011 3:49 am
Subject: Re: Improving BibTeX export

On May 6, 1:58 am, Dan Stillman <dstill...@zotero.org> wrote:

> On 5/5/11 11:04 PM, spartan wrote:

> > Say, in a reference item, $\alpha$ got embedded in the title.

> The rest of your argument depends on this assumption, but $\alpha$ is
> "garbage" to all non-LaTeX-using members of the Zotero ecosystem. It
> shouldn't be there to begin with, and it's a bug if it is. Import
> translators should be fixed to ensure that it doesn't end up in a Zotero
> field. Batch editing should allow people to fix that data.

These 'bugs' or 'garbage' are not likely generated by end users like
you and me but come from various websites over the years. You don't
want people to throw their databases away because of the bugs inside.
The current importer cannot fix things already in the databases. My
suggested approach actually can fix the bugs by an export and an
import.

> Our philosophy on this is pretty well documented by this point, in this
> thread and elsewhere. You may feel that BibTeX-using Zotero users "do
> not need their databases to look appealing to other guys", but that's
> contrary to Zotero's goals. This isn't a "punitive" policy, and it's in
> no way specific to BibTeX. Adding support to export translators for
> format-specific markup in Zotero fields encourages the entry of bad data
> in Zotero fields.

If people are so keen on manually inputting data in the title, why do
they use zotero? Or are you saying that zotero's practice here really
has a big impact on many websites? The only encouragement I can see is
for a casual individual user who does a lot of manual input, which
is really not a big deal since zotero is not targeted at such people.

May 6 2011, 5:28 am
Frank Bennett <biercena...@gmail.com>
Fri, 6 May 2011 02:28:12 -0700 (PDT)
Local: Fri, May 6 2011 5:28 am
Subject: Re: Improving BibTeX export
spartan,

It seems that everyone is in favor of normalized source, on both sides
of a Zotero/BibTeX (or any other) conversion. Your correspondents in
this discussion are only opposed to the implicit normalization of text
by the tools that move data across the Zotero/BibTeX divide.

Personally, I would have to agree, based on some recent personal
experiences with the Zotero citation formatter (of which I am the
perpetrator in chief). Users do all sorts of strange and wonderful
things in their data, and records that work swimmingly in one context
(say, when talking to BibTeX) can cause consternation and havoc in
others. So all Dan is suggesting is that data coming into Zotero
(including "historical" BibTeX records) should be normalized at the
entry gate, where (relatively) safe assumptions can be made about the
intention behind various patterns in the text. The emphasis differs
from your own initial position, but the aim is the same, and there is
a carefully considered logic behind taking that approach. (It's
certainly not a strategy for demolishing anything or punishing anyone
-- that's outside the scope of the Zotero mission statement :)

Frank

May 6 2011, 10:50 am
Olivier Cailloux <olivier.caill...@ecp.fr>
Fri, 06 May 2011 16:50:28 +0200
Local: Fri, May 6 2011 10:50 am
Subject: Re: Improving BibTeX export
Hello,

Many thanks to spartan for his improvements to the bibtex exporter, which I look forward to seeing integrated! (Naturally, after reaching a consensus in the discussion with the main zotero devs.) Thanks also for this thread, which raises some interesting questions.

I am a zotero user and a bibtex user, bibtex being my primary (almost my only) means of actually using the zotero data when I write papers.

Although I would be very happy to see better bibtex support in zotero, I do agree with the zotero team that general support of bibtex-formatted entries in the zotero db such as $\alpha$ is not the way to go, at least in the long term. However, we do have a problem (as mentioned already by others) in that zotero does not accept maths in its fields, e.g. $\frac{a}{b}$. We can dream of a very good and general solution for maths support in zotero, but AFAIK that's not going to happen anytime soon. In the meantime, let us try to think about an acceptable solution for allowing minimal maths support in zotero.

My suggestion (off the top of my head, so it may be stupid): use <latex> </latex> markups. On bibtex import, $..$ would be changed to <latex>..</latex> when there is no alternative (such as for $\frac{a}{b}$). On export, it would be changed back. For better compatibility with other formats, bibtex import should preferably produce pure Unicode rather than <latex> code when possible (such as for $\alpha$).
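
The round trip suggested above could be sketched as follows. This is an illustration under stated assumptions, not an existing Zotero feature: the <latex> tag, the tiny SYMBOLS table, and the function names importMath and exportMath are all hypothetical, and the sketch ignores titles that merely contain stray dollar signs.

```javascript
// Hypothetical symbol table, kept tiny for the example.
const SYMBOLS = { alpha: "\u03B1", beta: "\u03B2" };

// Import: "$\alpha$" becomes Unicode when the symbol is known; any other
// "$...$" span is wrapped in <latex>...</latex> so it survives in the field.
function importMath(text) {
  return text.replace(/\$([^$]+)\$/g, (match, body) => {
    const single = /^\\([A-Za-z]+)$/.exec(body);
    if (single && Object.prototype.hasOwnProperty.call(SYMBOLS, single[1])) {
      return SYMBOLS[single[1]];
    }
    return "<latex>" + body + "</latex>";
  });
}

// Export: turn the tags back into math-mode delimiters.
// A replacer function is used so that "$" is inserted literally.
function exportMath(text) {
  return text.replace(
    /<latex>([\s\S]*?)<\/latex>/g,
    (match, body) => "$" + body + "$"
  );
}
```

Importing "Ratio $\frac{a}{b}$ of $\alpha$ decay" would store the fraction inside <latex> tags and the alpha as a Unicode character; exporting restores $\frac{a}{b}$ and leaves the Unicode alpha for the exporter's normal character mapping.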

I agree that these <latex> tags would currently be bibtex specific, which is not very satisfactory compared to the general zotero goals. However, there is no theoretical reason other zotero exporters could not support them, e.g. someone could write a latex-to-html transforming exporter for zotero (I guess such tools already exist for other software), latex-to-word, etc.

This proposal, although imperfect, seems to me more satisfactory than the current situation, where $\frac{a}{b}$ gets transformed into something that nobody is happy with, be they HTML, Word, or bibtex format users.

Just my 2 cents.
Olivier

May 6 2011, 10:51 am
spartan <spartan...@gmail.com>
Fri, 6 May 2011 07:51:47 -0700 (PDT)
Local: Fri, May 6 2011 10:51 am
Subject: Re: Improving BibTeX export
I could understand it if zotero's stand on this issue were based on
"Non-Doing". But it actually intentionally implemented a piece
of code to make the exported bib file even messier. The current
entrance normalization procedure (i.e. import) is far from
complete. So latex or semi-latex like markups often easily
get inside our databases. Instead of exporting them as they are, so
that we can find a better way to normalize them through re-import
later, the current implementation makes them irrecoverable.

It is as if, say, the US were too crowded now, so the government
wanted to promote the use of trains for transportation. It would be
a reasonable measure to build better railroads and make
more trains, etc. But it would be outrageous to start digging
holes in the normal roads to prevent people from driving their
cars. Sadly, this kind of policing behavior is how zotero
treats bibtex now.

On May 6, 5:28 am, Frank Bennett <biercena...@gmail.com> wrote:

