[Copied from a Zotero forums post]
There are multiple ways to get the selected text in a paper. However, some
return text that differs from the original: not visually, it seems, but in
its Unicode encoding, which matters for TTS and for any downstream code
doing some form of text matching.
The user provided this PDF for testing:
https://github.com/user-attachments/files/16676094/A.pdf
If you select the Å character,
reader._iframeWindow.getSelection().getRangeAt(0).toString()
returns the correct U+00C5 (LATIN CAPITAL LETTER A WITH RING ABOVE) character
However,
reader._lastView._selectionPopup.annotation.text
returns two characters, U+0041 (LATIN CAPITAL LETTER A) and U+030A (COMBINING RING ABOVE), which is the canonically decomposed form of the same character.
(reader is just shorthand for the internal reader of the currently selected reader tab)
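To make the difference concrete, here is a minimal sketch that can be pasted into any JavaScript console. The two strings are hard-coded stand-ins for the values reported above, not read from Zotero, and codePoints is a hypothetical helper written just for this illustration. Normalizing to NFC is only my assumption about a possible workaround; I haven't confirmed it is safe for every PDF, since it would also recompose text that was decomposed in the original.

```javascript
// Hypothetical helper: list the code points of a string as "U+XXXX" labels.
function codePoints(text) {
  return Array.from(text, (ch) =>
    "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0")
  );
}

// Stand-ins for the two reported values (assumed, not fetched from the reader).
const fromSelection = "\u00C5"; // as reported for getSelection()
const fromPopup = "A\u030A";    // as reported for _selectionPopup.annotation.text

console.log(codePoints(fromSelection)); // ["U+00C5"]
console.log(codePoints(fromPopup));     // ["U+0041", "U+030A"]

// The strings look identical when rendered but are not equal as-is.
console.log(fromSelection === fromPopup); // false

// After NFC normalization the decomposed pair is recomposed and they match.
console.log(fromPopup.normalize("NFC") === fromSelection); // true
```

If exact matching against the PDF text matters, normalizing both sides to the same form before comparing is probably more robust than normalizing only one of them.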
I haven't done any digging into why this occurs, nor have I checked what
other ways there are of getting text out of Zotero to see if it occurs
elsewhere. (.getDisplayTitle() on library items and the .text and .comment
fields on sidebar annotations in the reader appear to be safe, since the
original user reported them as being fine.)