Zotero needs to normalize its data to a single Unicode normalization
form, preferably NFC. I found the problem as follows:
1. Exported bibliography from Refworks to Bibtex format.
2. Visually inspected the resulting file. Accented characters were
represented consistently throughout (as they would be after NFC
normalization).
3. Imported the Bibtex file into Zotero. The records looked normal on
inspection.
4. Imported an entry from the Library of Congress into Zotero.
5. Exported all entries to a Bibtex file in Unicode format. (See my
post here:
http://groups.google.com/group/zotero-dev/browse_frm/thread/61a52b5012bccd3c
)
6. Upon inspection of the exported file, I see that all entries are
exported properly except for the entry I imported from the Library of
Congress. In that entry, all accented characters appear in decomposed
form (a base letter followed by a combining mark).
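The visual inspection in steps 2 and 6 can also be done mechanically.
The following is only a sketch in Python, not Zotero code: the function
name find_non_nfc_lines and the sample data are made up for
illustration. It reports every line whose text changes under NFC
normalization, along with the combining marks found on that line.

```python
import unicodedata

def find_non_nfc_lines(text):
    """Return (line_number, line) pairs for lines that are not in NFC."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        # A line is already in NFC if normalizing it changes nothing.
        if unicodedata.normalize("NFC", line) != line:
            hits.append((number, line))
    return hits

# Made-up sample: line 1 uses precomposed U+00E9,
# line 2 uses "e" followed by combining acute U+0301.
sample = "title = {Caf\u00e9}\ntitle = {Cafe\u0301}\n"

for number, line in find_non_nfc_lines(sample):
    marks = [f"U+{ord(c):04X}" for c in line if unicodedata.combining(c)]
    print(number, marks)  # prints: 2 ['U+0301']
```

To run this against a real export, read the file as UTF-8 and pass its
contents to the function in place of the sample string.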
Deductions:
1. Given that all entries except for the LoC entry are consistently
and correctly coded, the problem is neither in the Bibtex import nor
in the Bibtex export.
2. Since the only faulty entry is the LoC entry, it is likely that
either the LoC serves accented characters in decomposed form, or the
import filter used for LoC entries produces decomposed Unicode data,
which is then passed to the rest of Zotero.
Suggested fix: After any import filter has run, the core Zotero code
should make a final pass over the imported data and normalize it to
NFC, regardless of which filter produced it.
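
To make the suggestion concrete, here is a rough sketch of such a final
pass. It is written in Python purely for illustration (Zotero itself is
JavaScript), and the item layout and the name normalize_item are
assumptions, not Zotero's actual data model. It walks whatever an
import filter returned and normalizes every string value to NFC.

```python
import unicodedata

def normalize_item(value):
    """Recursively normalize every string value in an item to NFC."""
    if isinstance(value, str):
        return unicodedata.normalize("NFC", value)
    if isinstance(value, list):
        return [normalize_item(v) for v in value]
    if isinstance(value, dict):
        # Only values are normalized; field names are left untouched.
        return {key: normalize_item(v) for key, v in value.items()}
    return value  # numbers, None, etc. pass through unchanged

# Hypothetical item as an import filter might hand it over,
# with decomposed accents (base letter + U+0301).
item = {
    "title": "Le cafe\u0301",
    "creators": [{"lastName": "Be\u0301dard"}],
    "year": 2007,
}

clean = normalize_item(item)
print(clean["title"] == "Le caf\u00e9")  # prints: True
```

Doing this once in the core, rather than in each filter, means a filter
that emits decomposed data cannot leak it into the rest of the system.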