Is there a good way to handle 'words' in an interlinear text that I
don't want to end up in my Lexicon? I'm working on a text right now that
talks about the meaning and etymology of a word. It has various
references to other languages, and even parts of words. Should I just
leave them as unknown, or create a POS category to throw them all into?
These are things I wouldn't ever want to include in a dictionary, but I
want to deal with somehow in the interlinear. How do others handle this?
To give an English transliteration of some of it:
Some people say the word 'gang' comes from the LgX word 'gong'; others
say it comes from 'gandi' in the word Samergandi; others say it comes
from LgY as used in 'kar gang'.
I've run across this before when my name comes up in texts that I'm
interlinearising. Again, I feel funny entering that as a lexical entry!
Thanks for your help,
Craig.
This proposal has been put forth, but it's not clear how many users
want it, or how important users consider it to be.
It would be great to hear (a) do users want such a feature, and how
important is it to your work, and (b) how do others work around it?
-Beth
I've added several other entry type categories besides "main entry" to
handle loanwords:
"Codeswitch to [national language]"
"Loanword from modern [national language]"
"Geographic Proper Noun"
"Proper Noun"
"Codeswitch" is for items that come up in texts for which I know that
they do possess indigenous words, and the national language word used
may not be known to less bilingual speakers. For this I check the
"exclude as headword" box also. Should we produce a lexicon, I would
remove all of these.
The "loanword" category is for words that are not ancient loans, and
maybe only partially incorporated into the phonology (we have those as
well) but yet are broadly known and used because there is no indigenous
word. These are often technological words, words dealing with government
or other institutions, or logical conjunctions. These I do not exclude
as headwords and I would probably add these to a lexicon, as their tone
category or pragmatic usage in our language sometimes differs slightly
from that in the nat. lg.
Ancient loanwords I just add as main entries, though noting the probable
origins in the etymology field. (Most of the numbers, for example, are
probably ancient loans from an ancient form of the national language.)
Eric
A couple of other people have responded to this giving some good
suggestions. If it is important to you to be able to gloss these words in
your texts, for instance so that they can be published or used in some other
way, then you will need to add them to your lexicon. But this is just a
temporary solution. Ultimately FLEx should provide a way to exclude them
from the vernacular lexical database. You could do this in Toolbox by
setting up a separate database for them and then telling the interlinearizer
to look in this second database as well as the primary vernacular database.
Unfortunately we don't have this option in FLEx yet, so you either have to
leave the word unanalyzed in your text or add it to your vernacular
database.
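To make the Toolbox-style arrangement concrete, here is a minimal sketch of a two-database lookup: the primary lexicon is consulted first, and a separate list of foreign words is only a fallback. The dictionaries and the names `primary_lexicon`, `foreign_words`, `look_up` and `interlinearize` are all hypothetical; this is plain Python for illustration, not the Toolbox or FLEx data model or API.

```python
from typing import List, Optional, Tuple

# Hypothetical stand-ins for the two databases (word -> gloss).
# These are NOT Toolbox or FLEx file formats.
primary_lexicon = {
    "the": "the", "german": "German", "word": "word",
    "for": "for", "dog": "dog", "is": "is",
}
foreign_words = {"hund": "dog (German)"}

PUNCT = ".,'\""


def look_up(word: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (gloss, source); the primary lexicon wins over the foreign list."""
    key = word.strip(PUNCT).lower()
    if key in primary_lexicon:
        return primary_lexicon[key], "primary"
    if key in foreign_words:
        return foreign_words[key], "foreign"
    return None, None  # left unanalyzed


def interlinearize(sentence: str) -> List[Tuple[str, Optional[str], Optional[str]]]:
    """Gloss each whitespace-separated word of a sentence."""
    rows = []
    for raw in sentence.split():
        gloss, source = look_up(raw)
        rows.append((raw.strip(PUNCT), gloss, source))
    return rows


for row in interlinearize("The German word for dog is 'hund'."):
    print(row)
```

The point of the design is that 'hund' gets a gloss in the text without ever becoming an entry in the primary lexicon.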
The trick is to somehow mark these bogus words so that you can later delete
them. Eric Johnson has suggested setting up extra options in the Entry Type
field. This might be an acceptable solution for partially borrowed words,
but isn't a good solution for truly foreign words that get included in a
text. For instance I might say, "The German word for dog is 'hund'." The
word 'hund' does not belong in an English dictionary. On the other hand,
some German words are working their way into English, but are only partly
assimilated. If you read grammars of Biblical Greek, you will frequently
encounter the German word 'aktionsart'. It is used in these grammars as a
technical term. It violates English spelling rules. So we would say it has
not been totally assimilated. If it were assimilated, it might be spelled
'actionsort'. So we might want to include 'aktionsart' in an English
dictionary, but mark it as a 'partial loan' or something like that. But we
wouldn't want 'hund' in the dictionary.
So we need a temporary solution for words like 'hund' until FLEx gives us a
better permanent solution. I would suggest that you use the DDP domain 9.8
'Unclassified and miscellaneous words' as a temporary home for such words.
Add them to your lexicon, classify them under domain 9.8, and then make a
note to yourself to delete them later. You should also mark them using the
"Exclude as Headword" field so they don't inadvertently get published. If
you are already using domain 9.8, you could add another domain 9.9 'Foreign
words in texts'. Later you can filter for all the words in this domain and
delete them.
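The workaround amounts to tagging entries and filtering on the tag later. A rough sketch of that logic follows, assuming a made-up `Entry` record with a set of domain codes and an exclude-as-headword flag. None of these names come from FLEx; in FLEx itself you would do the same thing with a filter on the semantic domain field.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple

FOREIGN_DOMAIN = "9.9"  # 'Foreign words in texts', the extra domain suggested above


@dataclass
class Entry:
    # Hypothetical record; not the FLEx data model.
    headword: str
    domains: Set[str] = field(default_factory=set)
    exclude_as_headword: bool = False


def publishable(entries: List[Entry]) -> List[Entry]:
    """Entries that would appear as headwords in a published dictionary."""
    return [e for e in entries if not e.exclude_as_headword]


def purge_domain(entries: List[Entry], domain: str) -> Tuple[List[Entry], List[Entry]]:
    """Split entries into (kept, removed) by membership in the given domain."""
    kept = [e for e in entries if domain not in e.domains]
    removed = [e for e in entries if domain in e.domains]
    return kept, removed


lexicon = [
    Entry("dog"),
    Entry("aktionsart"),  # partial loan, kept in the dictionary
    Entry("hund", {FOREIGN_DOMAIN}, exclude_as_headword=True),
]

print([e.headword for e in publishable(lexicon)])
kept, removed = purge_domain(lexicon, FOREIGN_DOMAIN)
print([e.headword for e in removed])
```

Setting both the domain and the exclude flag is deliberate: the flag keeps the word out of print in the meantime, and the domain makes it easy to find and delete later.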
Just an interesting side note: in Biblical Greek dictionaries you can find
words like 'marana' and 'tha' that are not Greek words, but are Aramaic
words. The New Testament contains a few Aramaic quotes and for some odd
reason the Aramaic words in these quotes are listed in the Greek
dictionaries right alongside the Greek words. The same is true of all the
Hebrew names that occur in the New Testament. This really makes it fun for
me when I'm analyzing Greek phonology or morphology and these words are
mixed in with Greek phonological patterns and inflectional endings. So I
would really like a way to exclude words in texts from being added to the
lexicon.
Ron Moe
This could be a useful feature for those of us in cross-border language
situations. We got to talk with some speakers of "our" language in a
neighboring country and could communicate to some degree, but kept
getting hung up on different loanwords from that national language.
Fortunately we had some foreign friends who spoke that language and
could fill in the gaps. In the future we may want to extract and share our
db with colleagues there who would only be interested in the
non-loanwords and analysis.
Eric
I've gone with new categories of Foreign word, and (subcategory) Foreign
name for words that are truly foreign (and that I wouldn't want in a dictionary).
I've also added Place name and Personal name as subcategories of Noun.
Place names may or may not be wanted in a dictionary (I think an
appendix of place-name spellings could be good; choosing which places to
include/exclude would come at a later date.) Personal names are for what
would be considered a usual name within the community (not a nat. lg name).
Craig.