Romance person/gender/number/case paradigms in FLEx

Zach Wellstood

Nov 18, 2015, 6:45:59 PM

to flex...@googlegroups.com

Dear all,

Would anyone happen to have experience building a FLEx lexicon for a Romance language, or for any language with extensive person/gender/number/case paradigms? We are organizing a project on a Romance language (Istro-Romanian), and it has proven quite complicated given FLEx's structure.

For instance, some possessive adjectives/pronouns have 8 forms for a single possessor (here, the 1st person singular).

1st person singular possessive adjective/pronoun "my/mine":

a mev/a me: sg. masc. nom./acc. possessum

a melvę: sg. masc. gen./dat.

a mę/a ma/a må: sg. fem. nom./acc.

a melję: sg. fem. gen./dat.

a mevo: sg. neut.

a melj: pl. masc. nom./acc.

a męle: pl. fem. nom./acc.

a melorę: pl. masc./fem. gen./dat.

Right now, the lexicon is organized such that the singular masculine possessive is the main entry to which all other forms of the paradigm are linked as "Inflected Variants." The entry ends up looking like this:

a mev (a me) (Ž) 1.sg.m.poss.nom/acc (a mę: 1.sg.f.poss.nom/acc, a męle: 1.pl.f.poss.nom/acc, a melvę: 1.sg.m.poss.gen/dat, a melorę: 1.pl.m/f.poss.gen/dat, a mevo: 1.sg.n.poss.nom/acc, a melj: 1.pl.m.poss.nom/acc, a melję: 1.sg.f.poss.gen/dat) poss. adj my, mine moj. čé-j a melvę́ óm? What's wrong with my husband?

a mevo (Ž) 1.sg.n.poss.nom/acc infl. of a mev (a me).

...etc.
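
As a rough illustration of the structure described above (one main entry with the remaining paradigm cells attached as inflected variants), here is a minimal Python sketch. It is not FLEx's data model or API: the class names, the `CELL_ORDER` list, and the `sorted_variants` helper are all hypothetical. The forms and feature labels are copied from the entry above. Keeping an explicit cell order is one possible way to get a stable display order in an export, independent of the order in which the variants were stored.

```python
from dataclasses import dataclass, field

# Hypothetical display order for paradigm cells (an assumption, not a FLEx setting).
CELL_ORDER = [
    "1.sg.m.poss.nom/acc",
    "1.sg.m.poss.gen/dat",
    "1.sg.f.poss.nom/acc",
    "1.sg.f.poss.gen/dat",
    "1.sg.n.poss.nom/acc",
    "1.pl.m.poss.nom/acc",
    "1.pl.f.poss.nom/acc",
    "1.pl.m/f.poss.gen/dat",
]


@dataclass
class Variant:
    form: str
    cell: str  # feature bundle, written as in the entry above


@dataclass
class Entry:
    headword: str
    cell: str
    gloss: str
    variants: list = field(default_factory=list)

    def sorted_variants(self):
        """Return the variants in CELL_ORDER, whatever order they were added in."""
        return sorted(self.variants, key=lambda v: CELL_ORDER.index(v.cell))


entry = Entry("a mev", "1.sg.m.poss.nom/acc", "my, mine")

# Variants in the order they appear in the displayed entry above.
for form, cell in [
    ("a mę", "1.sg.f.poss.nom/acc"),
    ("a męle", "1.pl.f.poss.nom/acc"),
    ("a melvę", "1.sg.m.poss.gen/dat"),
    ("a melorę", "1.pl.m/f.poss.gen/dat"),
    ("a mevo", "1.sg.n.poss.nom/acc"),
    ("a melj", "1.pl.m.poss.nom/acc"),
    ("a melję", "1.sg.f.poss.gen/dat"),
]:
    entry.variants.append(Variant(form, cell))

ordered_forms = [v.form for v in entry.sorted_variants()]
print(ordered_forms)
```

The sort key is only the position of each cell label in `CELL_ORDER`, so changing the preferred order means editing one list.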

Can anyone think of any drawbacks to this method of creating a paradigm?

Will FLEx offer the ability to create large paradigmatic tables for pronouns, demonstratives, etc. in the future?

Is it possible to change the order in which variants are displayed in the main entry? We originally ordered the Inflected Variants of a mev in a certain way but FLEx changed it arbitrarily after we reopened the project.

Further issues arise when parsing morphology:

What is the difference between setting up inflection features with inflection slots for nominal/verbal/adjectival suffixes as opposed to giving each individual suffix a separate entry in the lexicon (which could be hidden during publishing)? Will this affect the parser? Is setting up inflection features & templates easier in the long run?

Finally, if there happens to be anyone in the group who has worked on/created a FLEx database for a (Romance) language similar to the one presented here: would you be willing to exchange notes or share the database with us (in confidence) so that we could see a real example of how this work could be done? We have read several FLEx guides available online, but nothing beats a working example.

Many thanks in advance for any help.

Jeff Shrum

Nov 18, 2015, 11:04:47 PM

to flex...@googlegroups.com

Zach,

I think you have correctly organized your data. I used inflectional variants and plural variants in a Nilo-Saharan language with more permutations than you are showing. If you go ahead and assign inflectional features (nominative, genitive, dative, accusative) to the affixes, I think you will be able to get the morphological parser to work.

I know Portuguese and have done some work in Portuguese, but not too much. I will say that Portuguese irregular verbs, and the subjunctive forms of most verbs, should be handled with stem names. I am assuming that you will have something like the following example from Portuguese that I used in a workshop:

Example from Portuguese—Present tense, subjunctive mood

In Portuguese the present subjunctive verb is formed by taking the first person singular indicative and removing the final vowel, then inflecting that stem for person and number. See the paradigm for the verb /ver/ “to see” below.

VER (to see)

            Present indicative   Present subjunctive
I           vejo                 veja
you (sg)    vês                  vejas
he/she      vê                   veja
we          vemos                vejamos
you (pl)    vedes                vejais
they        veem                 vejam
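
The formation rule stated above (take the first person singular indicative, remove the final vowel, then add person/number endings) can be sketched in a few lines of Python. This is only an illustration of the rule, not anything FLEx runs. The function names are hypothetical, and the ending list covers only the -er/-ir pattern shown in the table for "ver".

```python
# Present subjunctive endings for Portuguese -er/-ir verbs such as "ver",
# in the order: I, you (sg), he/she, we, you (pl), they.
# (-ar verbs take a different set of endings, which this sketch does not cover.)
ER_IR_SUBJUNCTIVE_ENDINGS = ["a", "as", "a", "amos", "ais", "am"]

VOWELS = "aeiou"


def subjunctive_stem(first_sg_indicative):
    """Remove the final vowel of the 1sg present indicative form."""
    if not first_sg_indicative or first_sg_indicative[-1] not in VOWELS:
        raise ValueError("expected a form ending in a vowel")
    return first_sg_indicative[:-1]


def present_subjunctive(first_sg_indicative):
    """Build the six present subjunctive forms from the 1sg indicative."""
    stem = subjunctive_stem(first_sg_indicative)
    return [stem + ending for ending in ER_IR_SUBJUNCTIVE_ENDINGS]


forms = present_subjunctive("vejo")
print(forms)
```

Starting from "vejo", the stem is "vej", and the six forms it produces match the subjunctive column of the table.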

In order for the morphological parser to parse the present subjunctive, we need to define a stem name for the present subjunctive and use it as the stem from which all present subjunctive verbs are built. We will then add this stem as an allomorph to the lexeme /ver/. Also we will define a template for present subjunctive verbs and populate it with the present subjunctive person.number suffixes.
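
To make the interaction between a stem name and a template more concrete, here is a toy Python model of the idea. It is a sketch under stated assumptions and does not reproduce how the FLEx parser is implemented or configured: the dictionary layout, the `pres_subj` label, the `parse` function, and the segmentation of "ver" into the stems "ve" and "vej" are all hypothetical.

```python
# Toy lexicon: each lexeme lists its stem allomorphs.
# An allomorph may carry a stem name; None means "no stem name".
LEXICON = {
    "ver": [
        {"form": "ve", "stem_name": None},
        {"form": "vej", "stem_name": "pres_subj"},
    ],
}

# Toy template: it only accepts stems carrying the required stem name,
# and its single slot is filled by a person.number suffix.
PRES_SUBJ_TEMPLATE = {
    "requires_stem_name": "pres_subj",
    "suffixes": {
        "a": ["1sg", "3sg"],
        "as": ["2sg"],
        "amos": ["1pl"],
        "ais": ["2pl"],
        "am": ["3pl"],
    },
}


def parse(word, template=PRES_SUBJ_TEMPLATE):
    """Return (lexeme, stem_name, person.number) analyses licensed by the template."""
    analyses = []
    for lexeme, allomorphs in LEXICON.items():
        for allomorph in allomorphs:
            # The template rejects stems without the required stem name.
            if allomorph["stem_name"] != template["requires_stem_name"]:
                continue
            if not word.startswith(allomorph["form"]):
                continue
            remainder = word[len(allomorph["form"]):]
            for person_number in template["suffixes"].get(remainder, []):
                analyses.append((lexeme, allomorph["stem_name"], person_number))
    return analyses


print(parse("vejamos"))
print(parse("veja"))
print(parse("veam"))
```

In this toy model "vejamos" gets a single analysis and "veja" gets two (1sg and 3sg), while the non-word "veam" gets none, because the stem "ve" lacks the stem name the template requires.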

I am sorry that I do not have a FLEx project where I have modeled Portuguese, but I am certain that I could do it. I think FLEx has everything needed. If you have other specific questions about something that is not working for you, feel free to write back. I do have examples written up for the Nilo-Saharan language and English that I could share with you if you are interested, but it might be a stretch to see how they apply to your situation.

Jeff Shrum

Language Technology Consultant

Dallas, TX

--
You are subscribed to the publicly accessible group "FLEx list".
Only members can post but anyone can view messages on the website.
---
You received this message because you are subscribed to the Google Groups "FLEx list" group.
To unsubscribe from this group and stop receiving emails from it, send an email to flex-list+...@googlegroups.com.
To post to this group, send email to flex...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/flex-list/CAMvzDN%2BS4GbE3p4xZiAqR-h7RaujAmKkkUcK3-WdfwZCNvGvFg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Jeff Shrum

Nov 18, 2015, 11:37:01 PM

to flex...@googlegroups.com

Zach, I think I can add more about the following question, where you asked:

What is the difference between setting up inflection features with inflection slots for nominal/verbal/adjectival suffixes as opposed to giving each individual suffix a separate entry in the lexicon (which could be hidden during publishing)? Will this affect the parser? Is setting up inflection features & templates easier in the long run?

If you are going to use the morphological parser, then you will need to use grammar templates. I would set up a template for each word class; each template will have a slot for “case-suffix” or whatever label you like. I would assign each suffix to the slots in which it can occur, then assign each one its inflectional feature (nominative, genitive, dative, or accusative).

You show some suffixes that are homophonous. I saw several that could be either nominative or dative, for instance. (This is very common in Greek paradigms.) I would enter the nominative and dative morphemes separately in the database. Some would disagree with me about “splitting” these morphemes apart, but this approach will be the best for making the parser work. If the parser suggests too many incorrect parses, you can use “ad hoc” rules to prohibit a feminine suffix from being proposed for a masculine noun stem or the like.
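
The two ideas in the paragraph above (entering homophonous suffixes as separate morphemes, and filtering out gender-mismatched parses) can be sketched with a toy Python parser. All of the data here is invented for illustration and is NOT Istro-Romanian, and the gender filter is only a loose analogue of a FLEx ad hoc rule; the names `STEMS`, `SUFFIXES`, and `parse` are hypothetical.

```python
# Invented toy stems, each with an inherent gender (not real language data).
STEMS = {
    "kat": {"gender": "m"},
    "lun": {"gender": "f"},
}

# Each suffix is its own entry, even where two entries share a form ("u").
SUFFIXES = [
    {"form": "u", "slot": "case-suffix", "case": "nom", "gender": "m"},
    {"form": "u", "slot": "case-suffix", "case": "dat", "gender": "m"},
    {"form": "a", "slot": "case-suffix", "case": "nom", "gender": "f"},
    {"form": "e", "slot": "case-suffix", "case": "dat", "gender": "f"},
]


def parse(word, apply_gender_rule=True):
    """Return (stem, case, suffix gender) analyses for stem + one case suffix."""
    analyses = []
    for stem, properties in STEMS.items():
        if not word.startswith(stem):
            continue
        remainder = word[len(stem):]
        for suffix in SUFFIXES:
            if suffix["form"] != remainder:
                continue
            # Loose analogue of an ad hoc rule: block gender mismatches.
            if apply_gender_rule and suffix["gender"] != properties["gender"]:
                continue
            analyses.append((stem, suffix["case"], suffix["gender"]))
    return analyses


print(parse("katu"))                           # two analyses: nom and dat
print(parse("kata", apply_gender_rule=False))  # spurious feminine parse
print(parse("kata"))                           # blocked by the gender rule
```

Splitting the homophonous "u" into two entries gives the parser two separate analyses to propose, which a user can then disambiguate in context. The filter removes the parse in which a feminine suffix attaches to a masculine stem.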

As you know, inflected forms are not usually included in a dictionary, though sometimes irregular forms are included, depending on the purpose of the dictionary. Sometimes dictionaries will have a basic grammar in the front to help users find words in the dictionary. For example, the irregular Portuguese verb “ver” has the first person singular present indicative “vejo”. A tourist who hears “vejo” would not be able to find the definition of the word unless she knew Portuguese grammar and that there is an inflectional class of verbs that end in “-jo” instead of the much more common “-o”. Have you decided the purpose of the dictionary and how many forms and affixes you want with each head word? Have you considered including a grammar section with the most productive paradigms listed in the front matter and only having those forms that are not predictable as subentries under the headword?
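
One way to think about "only those forms that are not predictable" is to compare each attested form with what a regular rule would predict, and keep only the mismatches as subentries. The Python sketch below does this for a single paradigm cell. It is an illustration, not a FLEx feature: the function names are hypothetical, the regular rule (drop -ar/-er/-ir, add -o) is a simplification, and the regular verbs "comer" and "falar" are added here for contrast and do not come from the discussion above.

```python
def predicted_first_sg(infinitive):
    """Regular pattern assumed here: drop -ar/-er/-ir and add -o."""
    if infinitive[-2:] not in ("ar", "er", "ir"):
        raise ValueError("expected an infinitive in -ar, -er or -ir")
    return infinitive[:-2] + "o"


def unpredictable_forms(attested):
    """Keep only the attested 1sg forms that the regular rule does not predict."""
    return {
        infinitive: form
        for infinitive, form in attested.items()
        if form != predicted_first_sg(infinitive)
    }


# Attested first person singular present indicative forms.
ATTESTED_FIRST_SG = {
    "ver": "vejo",
    "comer": "como",
    "falar": "falo",
}

subentries = unpredictable_forms(ATTESTED_FIRST_SG)
print(subentries)
```

Only "vejo" survives the comparison, so it would be a candidate subentry under "ver", while "como" and "falo" would be left to the grammar section.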

Jeff Shrum

Zach Wellstood

Nov 19, 2015, 10:34:04 AM

to flex...@googlegroups.com

Hi Jeff,

Thank you very much for your responses. It's good to know that we are on the right track with how we've structured our lexicon, and the example from Portuguese was very helpful. We will have a fair number of cases like that, so stem names are a must.

Have you decided the purpose of the dictionary and how many forms and affixes you want with each head word? Have you considered including a grammar section with the most productive paradigms listed in the front matter and only having those forms that are not predictable as subentries under the headword?

These are good questions that we are still thinking about. Since the dictionary will be for use by both linguists and speakers, making the information as accessible as possible is best.

I do have examples written up for the Nilo-Saharan language and English that I could share with you if you are interested, but the examples might be a stretch to see how they apply to your situation.

Even if the examples are tangentially related, it helps us just to see what is possible in FLEx. Would you mind sending over some examples? You can send them off-list if you prefer.

Thanks!

Zach

Jeff Shrum

Nov 19, 2015, 5:44:20 PM

to flex...@googlegroups.com

Zach,

Since you are interested in my workshop materials, I have begun putting them on our wiki site: Lingtran.net. So far I have uploaded PDFs of the sessions that I think you would be most interested in. There is a module on stem names, but it needs more work. I do not have all of the data uploaded yet, so you will not be able to actually do the exercises, but please have a look. If these help you as is, great. If you need some of the data to actually try the exercises, let me know and I will double-check all of the data and upload it as soon as I am able. The link to the wiki site is:

http://lingtran.net/FLEx+8+Morphological+Parser

Jeff Shrum

SIL
