Mapping two \lx's


eel...@gmail.com

Oct 13, 2025, 5:14:57 AM
to FLEx list
Hi,

I've created a FLEx project where I use two writing systems for the Lexeme form (Wancho and IPA). How do I map this correctly when importing an SFM? I've labeled both the Wancho and the IPA column with \lx but that obviously doesn't work.

Eline

Anita Beniston

Oct 13, 2025, 5:48:58 AM
to flex...@googlegroups.com
Create a dummy entry in FLEx and export that entry in SFM format.

The export will show you the required markers; then set up your SFM file to match that entry.

--
"FLEx list" messages are public. Only members can post.
flex_d...@sil.org
http://groups.google.com/group/flex-list.
---
You received this message because you are subscribed to the Google Groups "FLEx list" group.
To unsubscribe from this group and stop receiving emails from it, send an email to flex-list+...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/flex-list/7be954c0-7175-4e0c-b3c5-6380964f220an%40googlegroups.com.

Andreas_Joswig

Oct 13, 2025, 6:41:21 AM
to flex...@googlegroups.com
Hi Eline,
I'm not sure I understand your case correctly. As I understand it, you
  • have an SFM file with only one line for the lexeme form (\lx)
  • want to have that information in a FLEx database in both writing systems of the lexeme form
  • want to change one of them, presumably the IPA writing system, so that the orthographic data is converted to IPA characters

If this understanding is correct, I would import the \lx field only once, into the Lexeme Form field of FLEx in the Wancho writing system. Then you can bulk-copy each lexeme form into the other writing system, so that both contain the same information. Next, change the writing system of the copied forms to the new (IPA) system using the bulk-replace dialog (click More to see the Format button).

Then you could use other bulk-replace operations to change individual characters to the IPA equivalents.

Let me know if this is unclear! Warm greetings,
Andreas

Kevin Warfel

Oct 13, 2025, 7:22:10 AM
to flex...@googlegroups.com
Eline,

I'm understanding your situation a bit differently than Andreas expressed. Hopefully one of our responses will give you the information you need.

I'm understanding you to have two separate columns in a spreadsheet, one with the lexeme form in the Wancho orthography and the other with the lexeme form in IPA form, but you are wanting to import both into the Lexeme Form field in FLEx (just in different Writing Systems). If this is your reality, I would use \lx for the Wancho form and \lx-IPA for the IPA form. Then for the import, map \lx to the Wancho Writing System (WS) of the Lexeme Form field and \lx-IPA to the IPA WS of the Lexeme Form field. If you need help with the mapping to a specific WS in the Lexeme Form field, ask for more details. I would need to dig a bit to provide those details just now, so I'll send this for now.
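As a rough illustration of that marker layout, here is a minimal Python sketch that turns two spreadsheet columns into SFM records with \lx for the Wancho form and \lx-IPA for the IPA form. The column names (`wancho`, `ipa`), the sample rows, and the helper `rows_to_sfm` are all hypothetical; the IPA values are placeholders, not real data.

```python
import csv
import io

# Hypothetical two-column data. The Wancho cells use Unicode escapes for
# characters from the Wancho block; the IPA cells are placeholders.
SAMPLE_CSV = (
    "wancho,ipa\n"
    "\U0001E2D9\U0001E2D6,ipa-form-1\n"
    "\U0001E2D6\U0001E2D9,ipa-form-2\n"
)


def rows_to_sfm(csv_text):
    """Return SFM text with one record per row, using \\lx and \\lx-IPA."""
    records = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        records.append(f"\\lx {row['wancho']}\n\\lx-IPA {row['ipa']}\n")
    # Joining with a newline leaves one blank line between records.
    return "\n".join(records)


sfm_text = rows_to_sfm(SAMPLE_CSV)
# ascii() keeps the output printable on consoles that cannot show Wancho.
print(ascii(sfm_text))
```

If the text is saved to disk, writing it with `open(path, "w", encoding="utf-8")` keeps the Wancho characters in Unicode, so each marker can then be mapped to its own writing system during import.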

Best wishes,
Kevin

eel...@gmail.com

Oct 13, 2025, 7:37:23 AM
to FLEx list
Hi all,

Kevin, you understood my problem correctly. With Anita's tip I got quite far. All fields are imported now. The only remaining problem is that the Wancho script doesn't display correctly (only boxes). I specified the font I use in FLEx (Noto Sans Wancho), and the importer itself indicates there should be a Windows 1252<>Unicode conversion. I tried importing without the conversion, but then I got a lot of error messages.

Screenshot 2025-10-13 133414.png
Screenshot 2025-10-13 133609.png
Do you see any obvious errors?

Eline

Kevin Warfel

Oct 13, 2025, 7:51:02 AM
to flex...@googlegroups.com
If your Wancho characters are already in a Unicode font, you shouldn't need a converter. I would have expected the "<Already in Unicode>" option to work (rather than "Windows1252<>Unicode"). Did you try that and it didn't work? Or what was your rationale for using a converter?
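To illustrate why a converter can do harm when the text is already Unicode, here is a small Python sketch. It takes two characters from the Unicode Wancho block, encodes them as UTF-8, and then reads the same bytes back as Windows-1252. Whether the FLEx converter behaves exactly like Python's `cp1252` codec is an assumption; the sketch only shows the general effect of misreading UTF-8 bytes.

```python
# Two characters from the Unicode Wancho block, written as escapes so the
# snippet does not depend on the editor's font.
wancho = "\U0001E2D9\U0001E2D6"

utf8_bytes = wancho.encode("utf-8")

# Correct reading: the bytes are already Unicode (UTF-8).
as_unicode = utf8_bytes.decode("utf-8")

# Incorrect reading: the same bytes treated as Windows-1252 text.
as_cp1252 = utf8_bytes.decode("cp1252")

print(len(as_unicode))   # 2 characters, the original Wancho text
print(len(as_cp1252))    # 8 characters, one per byte
print(ascii(as_cp1252))  # unrelated Latin letters and punctuation
```

Each Wancho character becomes four unrelated characters, none of which a Wancho-only font is likely to cover. That may be one source of boxes, though this is a guess rather than a diagnosis.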

Kevin

kevin_...@sil.org

Oct 13, 2025, 9:18:13 AM
to flex...@googlegroups.com

Apologies, Eline. I see that you wrote that you did try to import with no converter but got a lot of error messages. I missed that part of your message when I replied earlier.

I have no more advice for you, but I'm sure there are others who have relevant knowledge that I'm lacking and will be happy to share it for your benefit. (And I'll learn something as well.)

Kevin


Beth-docs Bryson

Oct 13, 2025, 11:03:29 AM
to flex...@googlegroups.com
I know that I have seen some Jira issues that might be related to this.  Please write to FLEx_...@sil.org; I expect there will be some answers from them.

-Beth


David Rowe

Oct 13, 2025, 11:53:57 AM
to flex...@googlegroups.com
Eline,

Is it possible to create a file with a few of your SFM records and post it here? I'd like to look at how your Wancho text is encoded.

Thanks,
David

eel...@gmail.com

Oct 21, 2025, 9:33:26 AM
to FLEx list
Hi David,

Attached is the file I tried to import.

Eline



sheetswipetest.xlsx

David Rowe

Oct 21, 2025, 3:06:17 PM
to flex...@googlegroups.com
Eline,

Thanks for the excellent test data.

The Wancho text in the file is already Unicode, so there should be no need to use any encoding converter when importing that field.

Based on the five records you list, it seems that the \lx and \lx_Wan fields have the same data, so you'll likely only want to import one of those fields.
The \lx_ipa has the IPA equivalent, the \g_Eng has the English gloss, the \ps_Eng has the part of speech. 

I made an SFM test file from your spreadsheet, dropping the \lx_Wan field. (I assume you created something similar with sheetswiper.) Attached is the Wancho.txt file I used.
The lexeme (𞋙𞋖) in the first entry has the characters U+1E2D9 U+1E2D6 correctly encoded in UTF-8 as F0 9E 8B 99 F0 9E 8B 96.
It's been some time since I imported data into FLEx, so others on this list may be able to correct any mistakes I've made. I mapped:
  • \lx to Lexeme Form
  • \lx_ipa to Pronunciation Form
  • \g_Eng to Gloss
  • \ps_Eng to Part of Speech
When I got to the Readiness step of the import, I got the following report (truncated to the errors for the first line):
Error in SFM file at line 1: SFM 'lx' contains character value 0xD838, which is invalid and has been removed.
Error in SFM file at line 1: SFM 'lx' contains character value 0xDED9, which is invalid and has been removed.
Error in SFM file at line 1: SFM 'lx' contains character value 0xD838, which is invalid and has been removed.
Error in SFM file at line 1: SFM 'lx' contains character value 0xDED6, which is invalid and has been removed.

The characters U+1E2D9 U+1E2D6 in the \lx field on line 1 would be encoded as D838 DED9 D838 DED6 in UTF-16. 
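The byte and code-unit values quoted above can be checked with a short Python sketch. It encodes the two characters as UTF-8 and UTF-16 and also derives the surrogate pairs by hand; the helper name `to_surrogates` is made up for this illustration.

```python
# The first lexeme: U+1E2D9 followed by U+1E2D6.
lexeme = "\U0001E2D9\U0001E2D6"

# UTF-8 bytes as space-separated hex (bytes.hex with a separator needs Python 3.8+).
utf8_hex = lexeme.encode("utf-8").hex(" ").upper()

# UTF-16 (big-endian, no byte order mark) split into 16-bit code units.
utf16 = lexeme.encode("utf-16-be")
utf16_units = [utf16[i:i + 2].hex().upper() for i in range(0, len(utf16), 2)]


def to_surrogates(code_point):
    """Split a code point above U+FFFF into its UTF-16 surrogate pair."""
    offset = code_point - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


high, low = to_surrogates(0x1E2D9)

print(utf8_hex)                  # F0 9E 8B 99 F0 9E 8B 96
print(utf16_units)               # ['D838', 'DED9', 'D838', 'DED6']
print(f"{high:04X} {low:04X}")   # D838 DED9
```

The values in the error report match these UTF-16 code units one for one. One possible reading is that the importer validates each 16-bit unit separately instead of treating each surrogate pair as a single character, but that is only an inference from the messages.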

It's not clear to me why FLEx is discarding the data. I thought that perhaps the vernacular writing system needed to specify the characters, but giving the first and last Unicode characters of the Wancho block didn't work. Adding the Unicode values of the Wancho characters in the first lexeme didn't work either.

How can we get FLEx to accept Wancho characters?

Thanks,
David
Wancho.txt

eel...@gmail.com

Oct 21, 2025, 11:57:46 PM
to FLEx list
Hi David,

Thanks for investigating so thoroughly! It's especially strange that the Wancho characters can't be imported, because I can type Wancho manually in FLEx without problems. Exports also look good. I really hope someone understands the issue.

(I had \lx and \lx_Wan fields because that's what the SFM export from FLEx had too.)

Eline


