Composed vs Decomposed Characters


LSS PNG Office

Oct 17, 2017, 3:01:29 AM
to FLEx list
I am trying to help a user standardize their data across Paratext and FieldWorks.  Currently, they seem to have both composed and decomposed characters in both.

Paratext has the "Convert Project" option where you can choose "Composed".  I don't see such an option in FieldWorks.  Is there an option?  I'm concerned about doing a Find/Replace: will that mess up the word mapping and Interlinear data?

Of note: the Keyman keyboard I created for him types composed characters and I confirmed this in Paratext (not converted yet).  But when I use the same keyboard in FieldWorks, it seems that FieldWorks automatically converts them to decomposed in the vernacular field.  Is that expected?

Part of my question also relates to finding a word from Paratext. If you copy a word with composed characters in Paratext and paste it into the "Find" field in FieldWorks, will it find the same word with decomposed characters? What about the "Find in Dictionary" option in Paratext?

Thanks for helping me understand and sort this out,

James

Ken Zook

Nov 29, 2023, 8:30:30 PM
to FLEx list

FLEx always writes NFC (composed) data to external files, including fwdata, and to the clipboard, but internally it converts everything to NFD (decomposed). Whether a person types, pastes, or imports NFC, we still convert it to NFD internally and then save it out as NFC. FLEx does not offer any other options. So when working with Paratext, Paratext should be set to use NFC, and everything should work OK between the two. On the question of interlinear text, it's always NFD inside FLEx, but I assume in PT everything should be NFC.
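To make the NFC/NFD distinction concrete, here is a small Python sketch using the standard `unicodedata` module. It is only an illustration of Unicode normalization, not FieldWorks or Paratext code, and the helper name `same_text` is made up for this example.

```python
import unicodedata

composed = "\u00e9"      # é as a single code point (NFC form)
decomposed = "e\u0301"   # e followed by a combining acute accent (NFD form)

# The two strings look the same on screen but are different code point sequences.
assert composed != decomposed

# Normalizing converts one form into the other.
assert unicodedata.normalize("NFD", composed) == decomposed
assert unicodedata.normalize("NFC", decomposed) == composed


def same_text(a: str, b: str) -> bool:
    """Compare two strings regardless of which normalization form each uses."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)
```

A comparison like `same_text("caf\u00e9", "cafe\u0301")` treats the composed and decomposed spellings as the same word, which is the effect you get when both sides are normalized to one form before matching.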

The main issues you might run into are these: if you are using bulk edit options in FLEx, your converter needs to handle NFD data, and if you are using a Keyman keyboard where context is important, the keyboard must handle NFD.
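As a rough illustration of what "handle NFD data" can mean for a converter, the Python sketch below shows a find/replace rule written with a composed character failing on NFD input, and one way around it. The function `replace_in_nfd` is hypothetical and is not part of any FLEx converter interface; it assumes the converter receives NFD text and may return NFD text.

```python
import unicodedata


def nfd(text: str) -> str:
    """Return the NFD (decomposed) form of text."""
    return unicodedata.normalize("NFD", text)


def replace_in_nfd(data: str, find: str, replacement: str) -> str:
    """Hypothetical converter step: normalize the rule and the data to NFD, then replace."""
    return nfd(data).replace(nfd(find), nfd(replacement))


nfd_data = "cafe\u0301"  # assumed converter input: e + combining acute accent

# A rule written with a composed é does not match the NFD data as-is.
assert "\u00e9" not in nfd_data

# After normalizing the rule as well, the replacement (é to è) succeeds.
result = replace_in_nfd(nfd_data, "\u00e9", "\u00e8")
assert result == "cafe\u0300"
```

One caveat with this simple approach: a rule for a bare base letter such as `e` will also match the `e` inside a decomposed `é`, so real converters may need to check for following combining marks.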

 Ken
