hyphens in analysis


Brian O'Herin

Oct 25, 2018, 2:15:05 PM
to flex...@googlegroups.com

We have lots of prefixes, which are normally written as one word with the root they attach to. However, if they attach to a proper name (or to a noun that is identified with God and thus capitalized), there is an orthographic convention that uses a hyphen, e.g. “pre-God”. This messes up my FLEx analysis, since FLEx treats the two things on either side of the hyphen as separate words, giving me an unattached prefix and a (possibly obligatorily bound) root without its prefix.

 

Is there a way to make FLEx treat these as a single word, short of just removing the hyphens?

 

Brian

Craig

Oct 25, 2018, 4:58:39 PM
to flex...@googlegroups.com, Brian O'Herin
Search for "word-forming hyphen" in the help for instructions on configuring this.

Craig.
--
You are subscribed to the publicly accessible group "FLEx list".
Only members can post but anyone can view messages on the website.
To change your status, please write to flex_d...@sil.org.
You can join this group by going to http://groups.google.com/group/flex-list.
---
You received this message because you are subscribed to the Google Groups "FLEx list" group.
To unsubscribe from this group and stop receiving emails from it, send an email to flex-list+...@googlegroups.com.
To post to this group, send email to flex...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/flex-list/8995fffe8fc22b8fdf2c8854b79f83c3%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Jonathan Dailey

Oct 26, 2018, 9:59:32 AM
to FLEx List
The irony of the search terms is amazing.




--
Jonathan Dailey
SIL International
Language Technology Consultant

Leaders, Marlin

Oct 26, 2018, 1:23:24 PM
to flex...@googlegroups.com

Brian, there are a number of different hyphens in Unicode. The hyphen-minus, - (U+002D), is the one typically used, and I think it is what is causing your problem. If you want to treat all hyphen-minuses as word-forming, you could do that.

Or you can use a non-breaking hyphen instead: ‑ (U+2011). I think you can do a Search and Replace or Bulk Edit to swap out the hyphen-minus (U+002D) for the non-breaking hyphen (U+2011).
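
For text that is still outside FLEx, the same swap could be sketched in Python. This is only an illustration, not a FLEx feature: the function name `to_non_breaking` is made up here, and it assumes you only want to change a hyphen-minus that sits directly between two word characters, so that free-standing dashes are left alone.

```python
import re

NON_BREAKING_HYPHEN = "\u2011"

# A hyphen-minus (U+002D) with a word character immediately on each side.
# Note that \w also matches digits and underscores, so "10-12" would be
# changed too; narrow the pattern if that is not wanted.
_WORD_INTERNAL_HYPHEN = re.compile(r"(?<=\w)\u002D(?=\w)")


def to_non_breaking(text: str) -> str:
    """Replace word-internal hyphen-minus with the non-breaking hyphen."""
    return _WORD_INTERNAL_HYPHEN.sub(NON_BREAKING_HYPHEN, text)


if __name__ == "__main__":
    sample = "pre-God and a - b"
    converted = to_non_breaking(sample)
    # Print the code points so the change is visible even though the
    # two hyphens look alike on screen.
    print([f"U+{ord(c):04X}" for c in converted if not c.isalnum() and c != " "])
```

Whether FLEx then keeps the word together still depends on how it handles U+2011 in your project, so it is worth trying on one text first.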

 

Another hyphen nearby is ‐ (U+2010), which Unicode names simply HYPHEN; it is a punctuation hyphen. If I understand it correctly, I believe that would be a worse problem for you. I think it would split words up like a period would in an interlinear text.

 

Each of the three hyphens I’ve inserted here is a different Unicode code point. If you select any one of them in a Microsoft product such as Word and then press Alt+X, it will convert to its hexadecimal number. Conversely, you can type the number (leading zeroes can be omitted), select it, then press Alt+X to get the hyphen character. They all look the same to me, but they are different characters and do different things in FLEx interlinear texts, as I understand it.
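
If you are not in a Microsoft product, a short Python sketch can tell the look-alike characters apart. It uses only the standard `unicodedata` module; the helper name `describe` is just for illustration. It reports what Unicode says about each character, not how FLEx treats it.

```python
import unicodedata


def describe(ch: str) -> tuple[str, str, str]:
    """Return (code point, Unicode name, general category) for one character."""
    return f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch)


# The three hyphens discussed above, written as escapes so they cannot be
# confused with one another in the source.
for ch in "\u002D\u2010\u2011":
    print(describe(ch))
```

Pasting a character from your own text into `describe(...)` shows which of the hyphens you actually have.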

Hope this helps,

Marlin

Hugh Paterson

Oct 26, 2018, 1:33:15 PM
to flex...@googlegroups.com
Not yet mentioned is U+02D7 ˗ MODIFIER LETTER MINUS SIGN, which acts as a letter. I've not seen it in use in Bantu/Bantoid languages, but I have in Mande languages, where it is a letter, does not cause word breaks, is included with the word when you double-click the word, and is, in Ivory Coast, used as a tone mark.

all the best,




--
Hugh Paterson III Innovation Analyst
Innovation Development & Experimentation, SIL International
 

