Morpheme breaks in Arabic script


chas

May 15, 2017, 3:23:12 PM
to FLEx list
I'm entering some colloquial Arabic text in FLEx in Arabic script.

Is there a way to teach FLEx that a vowel belonging to a given morpheme is actually marked as a diacritic on the final consonant of the previous morpheme?

I'm thinking not, since Writing System configuration doesn't permit me to add a diacritic without a base character, and I believe that's what I need to do to make things work the way they should.

Just in case someone knows how to make this work, here's the situation: 

A given word consists of two morphemes. It looks like this: CVC-VC. (The hyphen marks the morpheme boundary, which in this case is also a syllable break.)

Because the initial V of the second morpheme is a "short" vowel, it is marked in the orthography as a diacritic on the final C of the first morpheme. This means that FLEx sees my word this way: CVCV-C
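
The mismatch described above can be sketched in Python. This is only an illustration under stated assumptions: the sample string is a made-up sequence of three Arabic consonants with a fatha on the first two (not data from the text in question), and `clusters` is a hypothetical, simplified helper that attaches any character with a nonzero Unicode combining class to the preceding base character. It does not reproduce how FLEx itself segments text.

```python
import unicodedata


def clusters(text):
    """Group each base character with the combining marks that follow it."""
    groups = []
    for ch in text:
        if groups and unicodedata.combining(ch):
            groups[-1] += ch  # diacritic: attach to the previous base
        else:
            groups.append(ch)  # base character: start a new cluster
    return groups


def cluster_boundaries(text):
    """Code-point offsets where a cursor moving by visual character can stop."""
    offsets = [0]
    for group in clusters(text):
        offsets.append(offsets[-1] + len(group))
    return offsets


# Illustrative only: kaf + fatha, teh + fatha, beh (stored as C V C V C).
WORD = "\u0643\u064E\u062A\u064E\u0628"
WANTED_BREAK = 3  # after the second consonant, before its fatha: CVC-VC

print([len(g) for g in clusters(WORD)])         # [2, 2, 1]
print(cluster_boundaries(WORD))                 # [0, 2, 4, 5]
print(WANTED_BREAK in cluster_boundaries(WORD)) # False
```

The wanted morpheme break falls at offset 3, inside the second cluster, which is why a cursor that moves cluster by cluster can only produce CVCV-C.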

If I could get FLEx to recognize that diacritics are sometimes vowels which belong to the following morpheme, I'd be a happy camper.

For now, I'm manually re-entering both morphemes when this behavior is present. I wish, though, that FLEx would let me insert these problematic morpheme breaks with my cursor. Is there a way to get FLEx to give me the option of treating a diacritic as a vowel which belongs to the following morpheme? 

Thanks. I found this question asked back in 2013, but I didn't see an answer. Maybe I missed it.
 

Beth-docs Bryson

May 15, 2017, 4:04:16 PM
to flex...@googlegroups.com
Chas-

I think what you are asking is for the ability to put a morpheme break between a base character and the diacritic that is attached to it, is that right?

When you are on the Morphemes line in the text, if you choose the option “Edit Morpheme Breaks” then you get a little dialog that allows more control than just moving the cursor around in the focus box (it doesn’t spontaneously insert spaces when you type a hyphen, trying to guess which is an affix and which is a root).  However, it is still the case that when you use the arrow keys, it moves over a complete "base character plus diacritic" combination.

However, there is a control sequence that allows you to change the behavior of the cursor.  I think if you search for “arrow by character” or something similar to that in the Helps, you will find a topic that refers to this.  (It is useful for Devanagari as well.)  If you type the special sequence, then it changes the logic so that when you use the left and right arrow keys to move the cursor, it moves by the actual underlying characters, not the visual character.  So it is possible to end up with the cursor *between* a base character and its diacritic.  (There is no visual representation of this—it just looks like the cursor stopped instead of advancing.  But if you count carefully and trust that it’s doing something you can’t see, then you can get it there.)  Once you get the cursor in this position, then you can type a space, and it will separate them.  The diacritic will likely appear with a dotted circle under it.

So that is part of the issue—getting it divided in the text area.  But you also need this morpheme in the lexicon, in order for it to do anything meaningful.  Have you tried entering the affix as a lexeme form that starts with a diacritic?  It should accept it. 

So if you have your -VC affix in the lexicon, and then you divide the word in the text into   CVC -VC, then it should match that surface form of -VC with the entry for the affix that is in the lexicon.
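
As a rough sketch of the idea (not of FLEx internals), the following Python splits a word by underlying code point rather than by visual character, then looks up both pieces. The sample word, the `LEXICON` dictionary, and both helper functions are hypothetical stand-ins introduced for illustration. FLEx's lexicon and matching logic are not modeled here.

```python
import unicodedata

FATHA = "\u064E"  # Arabic short vowel "a", a combining mark

# Hypothetical mini-lexicon keyed by surface form. Not FLEx's data model.
LEXICON = {
    "\u0643\u064E\u062A": "root (CVC)",
    FATHA + "\u0628": "suffix (-VC)",  # lexeme form that starts with a diacritic
}


def split_by_code_point(word, offset):
    """Split at a code-point offset, even between a base and its diacritic."""
    return word[:offset], word[offset:]


def starts_with_diacritic(form):
    """True if the first stored character is a combining mark."""
    return bool(form) and unicodedata.combining(form[0]) != 0


# Illustrative only: kaf + fatha, teh + fatha, beh (stored as C V C V C).
WORD = "\u0643\u064E\u062A\u064E\u0628"
stem, affix = split_by_code_point(WORD, 3)

print(starts_with_diacritic(affix))                  # True
print(LEXICON.get(stem), "+", LEXICON.get(affix))    # root (CVC) + suffix (-VC)
```

Once the affix form begins with the bare diacritic, an exact match on the surface form is enough to find it in this toy lexicon.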

(I am just talking about manual interlinearizing—this doesn’t require any parser being used.  It would work with the parsers also, but sometimes it helps to make sure you can do it manually, before you try to do it with one of the parsers.)

-Beth



chas

May 15, 2017, 4:54:37 PM
to FLEx list
Bless you, Beth Bryson! The Help screen you referred to sent me to a batch file included with FieldWorks. Running the batch file tweaked the Windows registry so that I'm getting exactly the behavior I wanted. 

Yes, you're right that what I was asking for was the ability to put a morpheme break between a base character and its attached diacritic. And now it seems I have that ability.

For the record (and so I can find this when I forget how to do this), the Help screen in question is "About Font Features", under Advanced Tasks / Writing Systems / Modifying a Writing System. The instructions are in the paragraph (marked with a bullet point) that begins "For fonts other than Charis SIL or Doulos SIL that have diacritics but for which the Font Features button or Diacritic Selection is not available..."

I'll test drive it a bit and let you know.

Thanks again! I was so happy to see FLEx's otherwise robust support of Arabic script that I was disappointed when I encountered this morpheme-break problem. But it seems you folks anticipated the problem. 

-Charlie L.

chas

Jun 2, 2017, 6:26:08 AM
to FLEx list
Beth B. directed me to a solution for the need to insert a morpheme break between a diacritic and its base character. It's working well. All I had to do was run a batch file that made a change in the Windows registry.

Now I'm wanting to get the same behavior under FLEx 8.3.8 for Linux. The relevant FLEx help page mentions a file "usr/share/fieldworks/Arrow by character.reg", which is indeed present on my system. 

My question: How do I use this file?

Since this is Linux and not Windows, I'm not clear on how and where to make the needed change.

Thank you very much.

-Charlie L.

Marlon Hovland

Jun 2, 2017, 12:40:42 PM
to flex...@googlegroups.com

Hello,

On native Xenial, I set the corresponding value in the ~/.mono/registry/CurrentUser/Software/sil/fieldworks/values.xml file to True, but it still did not behave the same as on Windows. It appears that this functionality is not implemented on Linux.

A Jira issue has been added.

Marlon

chas

Jun 2, 2017, 1:02:13 PM
to FLEx list
@Marlon: 

Thank you for your reply. You helped me to see where the change would be made, if it were possible to make it. But in the several values.xml files for Language Explorer I didn't see a value name that would correspond to "ArrowByCharacter".
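
For anyone who wants to repeat that search without opening each file by hand, here is a small Python sketch. It rests on assumptions: that each values.xml holds `<value name="...">` elements (the layout Mono's file-based registry is understood to use), and that the setting's name contains "arrow", which is a guess. The `find_values` function is a hypothetical helper, and finding or changing a value does not make the feature work on Linux, as noted above.

```python
import os
import xml.etree.ElementTree as ET


def find_values(registry_root, needle):
    """Return (file path, value name, value text) for names containing needle."""
    hits = []
    for dirpath, _dirs, files in os.walk(registry_root):
        if "values.xml" not in files:
            continue
        path = os.path.join(dirpath, "values.xml")
        try:
            root = ET.parse(path).getroot()
        except ET.ParseError:
            continue  # skip files that are not well-formed XML
        # Assumed layout: <values><value name="..." type="...">text</value></values>
        for value in root.iter("value"):
            name = value.get("name", "")
            if needle.lower() in name.lower():
                hits.append((path, name, (value.text or "").strip()))
    return hits


if __name__ == "__main__":
    # Directory taken from the path quoted earlier in this thread.
    registry = os.path.expanduser("~/.mono/registry/CurrentUser")
    for path, name, text in find_values(registry, "arrow"):
        print(path, name, text)
```

If nothing is printed, no value name under that directory contains the search string, which matches what I saw when looking manually.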

Thanks for entering the issue. I can use a Windows machine for now, but I was wanting to move my FLEx (and Paratext, too) work to Linux so that I could avoid the Windows Creators issues that are causing problems.

-Charlie L.