Something like this would be the process.
Jonathan
--
You are subscribed to the publicly accessible group "FLEx list".
Only members can post but anyone can view messages on the website.
---
You received this message because you are subscribed to the Google Groups "FLEx list" group.
To unsubscribe from this group and stop receiving emails from it, send an email to flex-list+unsubscribe@googlegroups.com.
To post to this group, send email to flex...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/flex-list/34befdc8-c6c2-4b11-b1fc-6a6ff061c60f%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Thapelo,
I would suggest entering all of the affixes and verb extensions in the lexicon, then clicking on the Components field for the lexeme and populating it with all the morphemes that make up the word. Below is a sample from a Bantu language. If you look at the Components line, it lists the morphemes, and they also appear within the parentheses in the dictionary entry at the top.
Jeff Shrum
SIL International
Language Technology Consultant
Dallas, TX, USA
I have a Setswana corpus of about 20 million tokens. This makes it possible for me to generate a frequency list, which can also be sorted alphabetically. It also lets me study the language as speakers actually use it, rather than what is potentially possible based on morphological rules. I am aware that a common lexicographic approach is to enter only the canonical form of verbs in dictionaries, and this is perfectly understandable. This is the approach we adopted in producing the largest monolingual Setswana dictionary: "Tlhalosi ya Medi ya Setswana". By our analysis, the whole dictionary contains no more than 5000 verbs. Not all of the verbs can take all Setswana suffixes and their various combinations. Some verbs attract many suffixes and others attract only a few. I have yet to see one that generates a hundred wordforms. In my earlier email I gave an example of the highly productive TSAMAYA.
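For anyone wanting to reproduce the frequency list step outside FLEx, a minimal Python sketch follows. It is not the tool used for the corpus described above; the function names and the tokenization rule (runs of letters, optionally joined by a hyphen or apostrophe) are assumptions, and the sample string is just a bag of wordforms quoted in this thread, not a real sentence.

```python
import re
from collections import Counter

# Assumed tokenization rule: a token is a run of letters, optionally
# joined by hyphens or apostrophes. A real corpus may need a different rule.
TOKEN_PATTERN = re.compile(r"[^\W\d_]+(?:[-'][^\W\d_]+)*")

def frequency_list(text):
    """Count how often each lowercased wordform occurs in the text."""
    return Counter(TOKEN_PATTERN.findall(text.lower()))

def by_frequency(counts):
    """Sort wordforms by descending count, breaking ties alphabetically."""
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

def alphabetical(counts):
    """Sort wordforms alphabetically, keeping their counts."""
    return sorted(counts.items())

# Illustrative input only: wordforms from this thread, repeated arbitrarily.
sample = "Kgosi kgosi dikgosi kgosing tsamaya dikgosi kgosi"
counts = frequency_list(sample)

for word, count in by_frequency(counts):
    print(f"{count}\t{word}")
```

Either sorted list can be written out one wordform per line and used as a plain word list.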
For nouns, we mainly add the plural prefix to mark plurality and the suffix -ng to change the noun into an adverb.
E.g.
Kgosi (chief)
Dikgosi (plural)
Kgosing (adverb)
The principal aim is to provide users/learners with an analysis of the wordforms in the dictionary. I am largely not interested in indicating the suffixes that may attach to a canonical verb, since they attach in various complex ways. The verbs would be the most challenging to deal with, and I am wondering whether everyone who has had to write a Bantu language dictionary has entered the canonical verbs only.
Thapelo
I take it that your suggested approach doesn't enter all the wordforms but only the basic headword with the possible suffixes.
Thapelo,
Not necessarily. If you want all of the fully inflected forms as head words, you can do that. If you have the list of surface forms, there are various ways to import them into your database. If you have the Setswana words with their glosses in a Standard Format file, you can import a word list. If you have just the list of Setswana words, then pasting the list in as a new text in the Text & Words area would be the simplest way to enter the words.
Then click on the Gloss tab and begin glossing the words. To get each surface form added as a head word to your database, go to the Tools menu and check the “Add words to the lexicon” option.
I would suggest creating a new project to do this, in case you are not happy with the results. When you have it working as you expect, the project can be merged with your existing project to join the two.
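As a rough illustration of the Standard Format option mentioned above, the Python sketch below turns word/gloss pairs into backslash-marker records. The marker names `\lx` and `\ge` and the function name are assumptions chosen for illustration; use whatever markers you plan to map when importing. The glosses for the second and third items are placeholders based on the labels given earlier in this thread.

```python
def to_standard_format(entries, word_marker="lx", gloss_marker="ge"):
    """Build Standard Format text: one record per (word, gloss) pair.

    The default marker names are assumptions, not a FLEx requirement.
    Records are separated by a blank line.
    """
    records = []
    for word, gloss in entries:
        records.append(f"\\{word_marker} {word}\n\\{gloss_marker} {gloss}\n")
    return "\n".join(records)

# Wordforms from this thread; glosses after the first are placeholders.
entries = [
    ("Kgosi", "chief"),
    ("Dikgosi", "plural of kgosi"),
    ("Kgosing", "adverb from kgosi"),
]

sfm_text = to_standard_format(entries)
print(sfm_text)
```

Saving `sfm_text` to a UTF-8 text file gives something to try in a scratch project before touching the main one.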
Jeff Shrum
SIL International
Language Technology Consultant
Dallas, TX, USA
-----Original Message-----
From: flex...@googlegroups.com [mailto:flex...@googlegroups.com] On Behalf Of Otlogetswe
Sent: Thursday, August 25, 2016 10:01 PM
To: FLEx list <flex...@googlegroups.com>