I think it's important to keep a few meta points in mind when thinking
about citation conversion:
First, there's not a single version of the Word format. There's the .doc
binary format, and there's now OOXML, which is going through a battle of
epic proportions to become an ISO standard. It's still unclear whether
it will succeed or not. But even if it fails, I expect MS will be
promoting it for quite a while, and that from their perspective .doc is
effectively deprecated.
I mention this because OOXML has a dedicated citation field.
So even within just this small context, we have in fact three kinds of
fields that all do the same thing: Endnote's custom field, Zotero's
custom field, and OOXML's standardized citation field.
Rintze Zelle wrote:
> To my mind there are two scenarios for users who want to switch from
> having Endnote fields in their existing Word documents to having
> Zotero fields:
> 1) The user still has access to Endnote, or at least to the Endnote
> library in a format Zotero can read.
> 2) The user no longer has access to his Endnote library.
... or "her"? ;-)
Moving on ...
Be careful about narrowing the use cases too much. For example, what
about a Zotero user who collaborates on a document with an Endnote user?
> In the first case, the user can import the references from Endnote
> into Zotero. The converter tool could then check if it can find a
> match for each Endnote field in the Word document and the references
> in the Zotero library. If a (sufficiently good) match is found the
> Endnote field can be automatically replaced by a Zotero field (or at
> least a replacement can be suggested to the user).
>
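To make the matching step in the first case concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the dictionary keys (`title`, `year`, `id`), the function names, and the 0.9 threshold are hypothetical, and it presumes the citation data has already been pulled out of the Endnote field. It only shows the "sufficiently good match, else ask the user" idea, not how Zotero or Endnote actually store anything.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase and collapse whitespace so trivial differences don't lower the score."""
    return " ".join(text.lower().split())


def similarity(field, item):
    """Score between 0 and 1: title similarity, forced to 0 if the years disagree."""
    if field.get("year") != item.get("year"):
        return 0.0
    return SequenceMatcher(
        None, normalize(field["title"]), normalize(item["title"])
    ).ratio()


def best_match(field, library, threshold=0.9):
    """Return (item, score) for the best match, or (None, score) if below threshold.

    A None result is the signal to fall back to asking the user.
    """
    best_item, best_score = None, 0.0
    for item in library:
        score = similarity(field, item)
        if score > best_score:
            best_item, best_score = item, score
    if best_score >= threshold:
        return best_item, best_score
    return None, best_score
```

A real converter would probably want to weigh authors and other fields as well, and to show near misses as suggestions rather than dropping them.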
> In the second case, there are two options:
> a) the Endnote fields in the Word document are scraped for citation
> data. Based on the collected data, an item is created in the Zotero
> library and the Endnote field is replaced by a Zotero field based on
> the new Zotero item. Whether this is a viable option probably depends
> on how rich the Endnote fields are, and how easily information can be
> extracted from them.
My understanding is that they are a) rich enough, and b) not hard to
extract data from.
> b) the Word document is searched for Endnote fields. For each field, a
> window is opened allowing the user to select the corresponding item in
> the Zotero database. This is the approach suggested by Simon, and
> requires the least amount of intelligence on the converter's side (as
> it doesn't perform any matching). The user however has to make sure
> that he has imported the required citations into Zotero before
> starting with the conversion.
What happens if you have a book manuscript with, say, 1000 citations to
300 unique sources?
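One way to keep that manageable, sketched below in Python, is to group the fields by source first, so the user (or a matcher) is consulted once per unique source rather than once per citation. The field keys (`author`, `year`, `title`) and the `resolve` callback are hypothetical names for illustration; `resolve` stands in for whatever picks the Zotero item, whether a dialog or an automatic match.

```python
def group_by_source(fields):
    """Map each unique source key to the positions of the fields that cite it."""
    groups = {}
    for position, field in enumerate(fields):
        key = (field["author"].lower(), field["year"], field["title"].lower())
        groups.setdefault(key, []).append(position)
    return groups


def convert(fields, resolve):
    """Call resolve() once per unique source and reuse its answer for every citation.

    Returns a list, parallel to fields, of whatever resolve() returned.
    """
    result = [None] * len(fields)
    for positions in group_by_source(fields).values():
        item_id = resolve(fields[positions[0]])
        for position in positions:
            result[position] = item_id
    return result
```

With 1000 citations to 300 sources, `resolve` would run 300 times instead of 1000. The grouping key here is deliberately naive; sources that differ only in punctuation would still be treated as distinct.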
> I guess that Simon's ticket should be handled first, presumably as a
> Word plugin separate from the current one. If that is in place,
> perhaps the converter tool could be extended to also automatically
> match Endnote fields to an imported library (for case 1). The latter
> would be especially helpful for users with a lot of documents (such as
> seems to be the case for olaf in discussion 1889).
>
> It is probably not necessary to extract formatting data from the Word
> document (e.g. which style has been used). The user should be able to
> select the correct bibliography style rather easily once the
> conversion to Zotero fields has been completed.
>
> One thing that would be helpful would be to have some Word documents
> available with Endnote references in them, so one can test the tool on
> them. Perhaps we could post a request for some in the forum
> discussions mentioned above.
Good idea. It might also be useful to collect some samples with OOXML
citations so you at least know how they differ.
Finally, just a reminder that the citation fields still contain local
database IDs, which is a rather gaping flaw with practical consequences.
So while you work on interop between Zotero and Word in .doc, it'd be
nice if we could see some movement on interop of an even more basic
sort: that between different Zotero installations (including different
users).
This of course goes back to previous discussions of global IDs (URIs and
such).
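To illustrate why local IDs are a problem, here is a small Python sketch. The data layout, the `local_id` and `uri` keys, and the example.org URIs are all made up for the illustration; this is not how Zotero fields are actually encoded. It just shows that the same local ID can point at different items on two installations, while a shared URI cannot.

```python
def index_by(library, key):
    """Build a lookup table from the given key to the library item."""
    return {item[key]: item for item in library}


def resolve_by_local_id(field, library):
    """Look the citation up by the database ID of the installation that wrote it."""
    return index_by(library, "local_id").get(field["local_id"])


def resolve_by_uri(field, library):
    """Look the citation up by a global identifier shared across installations."""
    return index_by(library, "uri").get(field["uri"])


# Two installations holding the same book under different local IDs (hypothetical data).
library_a = [
    {"local_id": 17, "uri": "http://example.org/items/kuhn1962", "title": "Structure"},
]
library_b = [
    {"local_id": 4, "uri": "http://example.org/items/kuhn1962", "title": "Structure"},
    {"local_id": 17, "uri": "http://example.org/items/feyerabend1975", "title": "Against Method"},
]

# A citation field written on installation A, carrying both kinds of identifier.
field = {"local_id": 17, "uri": "http://example.org/items/kuhn1962"}
```

Opened on installation B, `resolve_by_local_id(field, library_b)` silently returns the wrong book, while `resolve_by_uri(field, library_b)` returns the right one.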
Oh, and don't forget OpenOffice ;-)
Bruce
I raise the use case independent of any implementation concerns
precisely so the actual implementation doesn't thoughtlessly foreclose
the possibility of solving these sorts of problems.
Perhaps if someone had raised this use case a year or two ago I'd be
able to format my documents on different machines ;-)
...
> > Finally, just a reminder that the citation fields still contain local
> > database IDs, which is a rather gaping flaw with practical consequences.
>
> Yes, but we're probably still far off from a situation where both Endnote
> and Zotero include URIs in the reference fields of Word documents.
But that doesn't mean Zotero can't move to and exploit them. There is
a path forward to solving these problems.
Bruce
> I guess there are even more than that if Reference Manager and Procite
> use different field layouts. By the way, Endnote doesn't seem to use
> the citation support in OOXML if it's available (I checked this in
> Word 2007).
Yes, now you see the insanity of this! It's completely dysfunctional! I
can't collaborate with people using Endnote or Reference Manager. Hell,
I can't even collaborate with other Zotero users!
Having standard ways to encode citations is really important to clean up
this mess in my view. Just because Endnote doesn't use the new OOXML
citation field doesn't mean they can't, or they shouldn't. It's not in
their interest to use the standard field; they benefit from the
fragmented market in which they (currently) dominate.
The question WRT Word 2007 and 2008 is whether it's possible to use
the default citation field and API, but bypass the rather weak citation
formatter.
I believe John McCaskey was looking into this. The only thing I recall
about it is that he was looking to re-implement the citation
processing within Word.
Bruce