Zach,
I think you have correctly organized your data. I have used inflectional variants and plural variants in a Nilo-Saharan language with more permutations than you are showing. If you go ahead and assign inflectional features (nominative, genitive, dative, accusative) to the affixes, I think you will be able to get the morphological parser to work.
I know Portuguese and have done some work in it, though not much. I will say that Portuguese irregular verbs, and the subjunctive forms of most verbs, should be handled with stem names. I am assuming that you will have something like the following Portuguese example that I used in a workshop:
In Portuguese the present subjunctive verb is formed by taking the first person singular indicative and removing the final vowel, then inflecting that stem for person and number. See the paradigm for the verb /ver/ “to see” below.
VER (to see)
            | Indicative | Subjunctive |
            | Present    | Present     |
I           | vejo       | veja        |
you (sg)    | vês        | vejas       |
he/she      | vê         | veja        |
we          | vemos      | vejamos     |
you (pl)    | vedes      | vejais      |
they        | vêem       | vejam       |
In order for the morphological parser to parse the present subjunctive, we need to define a stem name for the present subjunctive and use it as the stem from which all present subjunctive verbs are built. We will then add this stem as an allomorph to the lexeme /ver/. Also we will define a template for present subjunctive verbs and populate it with the present subjunctive person.number suffixes.
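To make the stem-name idea concrete outside FLEx, here is a minimal Python sketch. It is not FLEx code and does not use any FLEx API. The names (`subjunctive_stem`, `LEXICON`, `parse`) are invented for illustration, and the suffix set is read directly off the paradigm for "ver" above, so it should not be assumed to cover other conjugation classes.

```python
# Illustration only: a toy model of "stem name + template", not FLEx itself.

# Person.number suffixes that fill the present subjunctive template,
# taken from the paradigm for "ver" (veja, vejas, veja, ...).
PRESENT_SUBJUNCTIVE_SUFFIXES = {
    "1sg": "a",
    "2sg": "as",
    "3sg": "a",
    "1pl": "amos",
    "2pl": "ais",
    "3pl": "am",
}

VOWELS = "aeiou"


def subjunctive_stem(first_sg_indicative: str) -> str:
    """Remove the final vowel of the 1sg present indicative (vejo -> vej)."""
    if not first_sg_indicative or first_sg_indicative[-1] not in VOWELS:
        raise ValueError("expected a form ending in a vowel")
    return first_sg_indicative[:-1]


def present_subjunctive(first_sg_indicative: str) -> dict:
    """Inflect the subjunctive stem for every person.number value."""
    stem = subjunctive_stem(first_sg_indicative)
    return {pn: stem + sfx for pn, sfx in PRESENT_SUBJUNCTIVE_SUFFIXES.items()}


# The lexeme "ver" carries an extra stem allomorph tagged with a stem name.
LEXICON = {"ver": {"present_subjunctive": subjunctive_stem("vejo")}}


def parse(word: str) -> list:
    """Return every (lexeme, stem name, person.number) analysis of word."""
    analyses = []
    for lexeme, stems in LEXICON.items():
        stem = stems["present_subjunctive"]
        for pn, sfx in PRESENT_SUBJUNCTIVE_SUFFIXES.items():
            if word == stem + sfx:
                analyses.append((lexeme, "present_subjunctive", pn))
    return analyses


print(present_subjunctive("vejo")["1pl"])  # vejamos
print(parse("veja"))  # both the 1sg and the 3sg analyses are returned
```

Note that "veja" comes back with two analyses, since the first and third person singular are homophonous in this paradigm.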
I am sorry that I do not have a FLEx project where I have modeled Portuguese, but I am certain that I could do it. I think FLEx has everything needed. If you have other specific questions about something that is not working for you, feel free to write back. I do have examples written up for the Nilo-Saharan language and for English that I could share with you if you are interested, but it might be a stretch to see how they apply to your situation.
Jeff Shrum
Language Technology Consultant
Dallas, TX
--
You are subscribed to the publicly accessible group "FLEx list".
Only members can post but anyone can view messages on the website.
---
You received this message because you are subscribed to the Google Groups "FLEx list" group.
To unsubscribe from this group and stop receiving emails from it, send an email to flex-list+...@googlegroups.com.
To post to this group, send email to flex...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/flex-list/CAMvzDN%2BS4GbE3p4xZiAqR-h7RaujAmKkkUcK3-WdfwZCNvGvFg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.
Zach, I think I can add more about the follow-up question where you asked:
What is the difference between setting up inflection features with inflection slots for nominal/verbal/adjectival suffixes as opposed to giving each individual suffix a separate entry in the lexicon (which could be hidden during publishing)? Will this affect the parser? Is setting up inflection features & templates easier in the long run?
If you are going to use the morphological parser then you will need to use grammar templates. I would set up a template for each word class; each template will have a slot for “case-suffix” or whatever label you like. I would assign each suffix to the slots in which it can occur, then assign each one the inflectional feature it marks: nominative, genitive, dative, or accusative.
You show some suffixes that are homophonous. I saw several that could be either nominative or dative, for instance. (This is very common in Greek paradigms.) I would enter the nominative and dative morphemes separately in the database. Some would disagree with me about “splitting” these morphemes apart, but this approach will be the best for making the parser work. If the parser suggests too many incorrect parses, you can use “ad hoc” rules to prohibit a feminine suffix from being proposed for a masculine noun stem or the like.
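The interaction between slots, split homophonous suffixes, and an ad hoc rule can be sketched in a few lines of Python. This is only a toy model, not FLEx and not your data: the stems "lun" and "dom", the suffix "-a", and all of the names in the code are hypothetical placeholders chosen for illustration.

```python
# Illustration only: hypothetical stems and suffixes, not real project data.
from dataclasses import dataclass


@dataclass(frozen=True)
class Suffix:
    form: str         # surface shape of the suffix
    case: str         # inflectional feature assigned to this entry
    gender: str       # gender of the stems it is meant to attach to
    slots: frozenset  # "wordclass.slot" names where it may occur


# One homophonous "-a" entered twice: once as nominative, once as dative.
SUFFIXES = [
    Suffix("a", "nominative", "feminine", frozenset({"noun.case-suffix"})),
    Suffix("a", "dative", "feminine", frozenset({"noun.case-suffix"})),
]

# Hypothetical stems with their word class and gender.
STEMS = {"lun": ("noun", "feminine"), "dom": ("noun", "masculine")}


def parse(word: str, use_ad_hoc_rule: bool = True) -> list:
    """Return (stem, case, suffix gender) for every analysis of word."""
    analyses = []
    for stem, (word_class, gender) in STEMS.items():
        if not word.startswith(stem):
            continue
        ending = word[len(stem):]
        for sfx in SUFFIXES:
            if sfx.form != ending:
                continue
            # The suffix must be assigned to this word class's slot.
            if f"{word_class}.case-suffix" not in sfx.slots:
                continue
            # Ad hoc rule: block a suffix whose gender clashes with the stem.
            if use_ad_hoc_rule and sfx.gender != gender:
                continue
            analyses.append((stem, sfx.case, sfx.gender))
    return analyses


print(parse("luna"))                         # nominative and dative analyses
print(parse("doma", use_ad_hoc_rule=False))  # two unwanted feminine analyses
print(parse("doma"))                         # [] once the ad hoc rule applies
```

Splitting the suffix means the ambiguity shows up as two separate analyses that can be approved or rejected individually, and the ad hoc rule trims the analyses that should never be proposed.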
As you know, inflected forms are not usually included in a dictionary, though sometimes irregular forms are included, depending on the purpose of the dictionary. Sometimes dictionaries will have a basic grammar in the front to help users find words in the dictionary. For example, the irregular Portuguese verb “ver” has the first person present indicative “vejo”. A tourist who hears “vejo” would not be able to find the definition of the word unless she knew Portuguese grammar and that there is an inflectional class of verbs that end in “-jo” instead of the much more common “-o”. Have you decided the purpose of the dictionary and how many forms and affixes you want with each headword? Have you considered including a grammar section with the most productive paradigms listed in the front matter and only having those forms that are not predictable as subentries under the headword?
Jeff Shrum
Zach,
Since you are interested in my workshop materials, I have begun putting them on our wiki site: Lingtran.net. So far I have uploaded PDFs of the sessions that I think you would be most interested in. There is a module on stem names, but it needs more work. I do not have all of the data uploaded yet, so you will not be able to actually do the exercises, but please have a look. If these help you as is, great. If you need some of the data to actually try the exercises, let me know and I will double-check all of the data and upload it as soon as I am able. The link to the wiki site is:
http://lingtran.net/FLEx+8+Morphological+Parser
Jeff Shrum
SIL