Pragmatics/semantics is the key to these questions. Two words that apparently come from the same root but have unrelated meanings are homophones and should be represented by separate entries. In your example, I would go for separate entries even though there is a semantic tie between ghost and place of the dead. I think that referring to a “place” in one instance and to a “being” in the other puts the two meanings too far apart semantically for them to be considered senses of the same word. A similar example is the use of gender in many European languages. In Portuguese /o sede/ is “headquarters” while /a sede/ is “thirst”. These are clearly homophones.
It is also very common in Bantu for words to occur only in the plural or only in the singular. In my experience many noun class 6 words occur only in the plural. Many are mass nouns such as “water”, but they do not have to be mass nouns. Perhaps there is a sense in which “place of the dead” is conceived of as a mass noun like “crowd” in English.
Jeff S.
SIL Mozambique
From: flex...@googlegroups.com [mailto:flex...@googlegroups.com] On Behalf Of Cory Shain
Sent: Thursday, January 03, 2013 9:02 PM
To: flex...@googlegroups.com
Subject: [FLEx] Singular and plural forms with different meanings?
I was hoping someone might have insights as to how best to handle a situation in which the plural and singular inflected forms of a root have different meanings. For example, in the Bantu language I'm working on (Yasa), the root /kuku/ takes the class prefix /mo-/ in the singular and /me-/ in the plural (/mokuku/ vs. /mekuku/), like all the other roots in its gender. However, the singular (/mokuku/) means "spirit" or "ghost", while the plural (/mekuku/) means "resting place of the dead." Is there a standard way of capturing this in FLEx?
--
Hi Cory,
Sorry to weigh in late on this issue. It is quite an interesting and complicated problem. To solve it you have to start with the premise that a dictionary is a service to the user. So we have to ask, How is the user best served? One way to answer this is to ask, Where will the user look to find the entry? We want to save the user steps in finding the entry. So if the user wants to find mekuku, will he look under mekuku or mokuku? If you have two entries that are cross-referenced, he will eventually get to the correct entry. If he will recognize mekuku as a plural form and has been taught to always look up a plural form under the singular, then your best solution is to just have one entry. But people often intuitively know when there is no corresponding singular for a plural-only form. So our English dictionaries have an entry “means” (as in “I don’t have the means to do that”) in addition to entries for “mean” (as in “That’s not what I mean,” and “You’re being mean.”) For native speakers of English, having an entry “means” works well. But non-native speakers would probably strip off the plural suffix and look under “mean”. So you have to test your users’ look-up strategies and see what they do. That will tell you where to place your entry.
Another question to ask is, Does it help the user if the dictionary combines (sub-)entries together or splits them apart? The answer to this seems to vary from language family to language family. The general rule is that people tend to think in terms of “words” and tend to look up “words”. Of course this means that we need to test what the user thinks of as a word. Over the millennia lexicographers have come to the conclusion that inflection does not create a separate word, but derivation does. We recognize that there is a continuum between inflection and derivation, so that there are in-between cases that are hard to judge. But as a general rule, the inflection-derivation distinction works well.
There is a related issue to this tendency to think in terms of “words” and to look up “words.” In recent years English dictionaries have begun including idiomatic phrases, constructions, and common collocations within the entry for a word. Here is the Longman Dictionary of American English entry for “means”. Their formatting is important and I hope the italics and bold font come through. Otherwise it will be difficult to make sense of their presentation.
means /minz/ n [plural] 1 a method, system, object etc. that is used as a way of doing something: We’ll use any means we can to raise the money. | She took up photography as a means of earning a living. | The oil is transported by means of (=using) a pipeline. 2 by all means said in order to emphasize that someone should do or is allowed to do something: By all means, drink while you are exercising. 3 by no means formal not at all: The results are by no means certain. 4 a means to an end something that you do or use only to achieve a result: Bev always says her job is just a means to an end. 5 the money or things that you have that make it possible for you to buy or do things: They don’t have the means to buy a car. | a man of means (=who is rich)
This entry doesn’t directly address your problem of mokuku/mekuku, but it does illustrate a couple of important things. First, all the multi-word expressions are in bold so that the user can easily scan the article and find what he wants. Second, users benefit from seeing all this information together. Rather than split this entry up into lots of small entries for idiomatic expressions, Longman made the decision to combine them. I believe their decision is sound because it recognizes that users will most likely look under “means” to find information about all these expressions. It also recognizes that these forms are related semantically and the user will think of them as all belonging together. In other words it fits the users’ look-up strategies and linguistic intuitions.
Having worked on Bantu language dictionaries, I know that the tradition is to list a noun under the singular unless it occurs only in the plural, in which case it is listed under the plural. However, mokuku/mekuku is an in-between case. I can’t speak with any degree of certainty about where your users will look for mekuku. However, I would tend to want to merge such senses into a single entry:
mokuku n 1 [only singular] spirit; ghost 2 mekuku [only plural] resting place of the dead
Then I would create a minor entry for mekuku in case the user looked there first:
mekuku see mokuku
However this is just a best guess. Testing might show that a two entry solution with cross-references is better:
mokuku n [only singular] spirit; ghost cf. mekuku
mekuku n [only plural] resting place of the dead cf. mokuku
FLEx enables you to do either of these.
Ron Moe
mokuku n [only singular] spirit; ghost mekuku n [only plural] resting place of the dead
mekuku see mokuku
(Note that we would get sense numbers if
linking a complex form to a specific sense of its root. I don't
think that's what Ron had in mind here. But if it is, I believe
the Component field in the entry for mekuku would link to a
specific sense of its root entry--a second sense which does not
actually apply to mokuku. I'll append the details in a P.S. just
in case.)
mokuku n [only singular] spirit; ghost cf. mekuku
mekuku n [only plural] resting place of the dead cf. mokuku
This approach should help with parsing texts. (I'm glad Ron
explained this to me years ago.) Suppose that you had an
ordinary word, motutu, which should get its own entry, and metutu,
an ordinary inflected form which should not. You would still want
metutu to be parseable in texts. But as long as you have an entry
for each prefix, a single entry for tutu should be sufficient for
parsing both motutu and metutu. For this language, we'd also
include a citation form for the sake of the printed dictionary.