where to store notes


Kevin Edgerton

May 3, 2016, 3:00:38 PM
to FLEx list
Hi,

I'm trying to figure out which lexical fields in FLEx are best for storing notes on lexical/semantic/etymological info from reviewers.  These notes have mainly to do with a word being borrowed or "pure P" language, high/literary or low/street register, standard or regional dialect, etc.

What would be the pros & cons of options for listing borrowed words (as well as dialectal variants) in this P database?  One question is whether the dictionary should be more "pure" or more widely descriptive, taking in these other forms.  How/where non-standard P words appear might be handled adequately by use of "show minor entries," if it wouldn't be too much work later to sort out the foreign and/or dialectal words.  (I want to be able to identify non-standard P-language words and separate these entries from standard P words--maybe through Bulk Edit Entries-List Choice?)

Should those borrowed words have their own lexical entries ("equal footing"), or simply appear in a custom field without a separate lexical entry?  (I wasn't planning to enter foreign words--whether as lexical items or in a sub-field such as a note--unless they are used in an MTT draft text or reviewer suggestion.)  

One socio-linguistic issue is the extent to which a word has been accepted into modern P.  Some of our reviewers have a purist mindset and don't want to use any non-P (foreign or other national language) words where a suitable (albeit high-register) P word exists.  (The borrowed word may not have reached "inherited" or native status, if I'm understanding that word right.)  The problem with high-register words is that we're aiming at a middle-school level, and sometimes the borrowed word is the better understood of the two.  

Other reviewers prefer the borrowed word, either because it is more common in general, or because it is dominant in their (capital city) dialect.

So for labeling, my options seem to be: Etymology, Cross-Reference & creating a field Foreign Borrowing (all entry-level); and Variant (type: Foreign Borrowing) & Lexical Relations/Synonym (both sense-level).  I realize these are not the same, but understanding how others use these categories would be helpful.

Other considerations:

1. I read the following in Ron Moe's Intro to Lexicography:
Note that there is currently no field in Language Explorer specifically devoted to borrowed words. An expansion of the etymology section has been requested. Until the programmers can implement it, you can use the Etymology field to indicate borrowed words. However it would be wiser to set up a custom field for them in order to keep inherited words and borrowed words separate in the database. In order to fully specify a borrowed word, you could set up separate custom fields to indicate the source language, the form in the source language, and the original meaning in the source language.

 a. What would be the reason for "keep[ing] inherited words and borrowed words separate in the database"?

 b. The suggestion to "set up separate custom fields" would be done where?  At the entry or sense level?  Is this what I did when I created an entry-level field "Borrowings"?  I understand that if I label a word Variant or Synonym under Lexical Relations, a box opens to create a new entry...but if I put such a word in a specially created field such as "foreign borrowing," then no separate lexical item will be created.  Is this correct?

 c. In light of the following, might it be better to treat borrowed words used in some P varieties as dialect synonyms or borrowed words?

Example (16) cannot be handled in this way because it is not really a variant. A variant is an alternate form of the same lexeme. But 'lift' and 'elevator' are different lexemes. The relation between them is actually more similar to that of synonyms, except that in this case one word is British and the other American. Another difference is that this kind of link can be between senses of a lexeme. In the case of 'lift' and 'elevator' each lexeme has other senses. Only one sense of each lexeme is involved in the variant relation. So rather than call these 'variants', we call them 'dialect synonyms'.

You have to handle dialect synonyms in the same way as you would regular synonyms using the Lexical Relations field on the sense level. You should create a new lexical relation in the Lists--Lexical Relations area. In the Reference set type field specify that it is an Entry/Sense Pair - 2 relation names. This allows you to give an abbreviation and reverse abbreviation for the cross-references in the two entries. If I was producing a dictionary in which British English was the primary dialect, I would give 'American dialect synonym' in the Name field, and give 'Am. var. of' as the Abbreviation. I would give 'British dialect synonym' as the Reverse Name and 'Am. var.' as the Reverse Abbreviation. Then in the entry for 'lift' in the Lexical Relations field I would select Insert British dialect synonym Relation (to this American dialect synonym) from the list of choices. In the Add Reference dialog box I would type 'elevator' in the Find box, click Choose a sense of the entry, then select the correct sense. The program will then add the appropriate cross-reference to the Dictionary view of each entry.


2. Under Choose Usages (sense level), I see the system is to choose from a limited menu; is there an option to add custom fields (as those listed in Moe's paper)?

3. Source (sense level):  Is there a good way/place to distinguish where a word in the dB originated and the source of later comments about the word?  Would both be entered here?  In general, how important is it to keep track in the dB which piece of info came from whom?

4. Difficulty of searching for non-lexical entries

Thanks for any thoughts anyone has!

Kevin

H. Hirzel

May 5, 2016, 3:00:35 PM
to flex...@googlegroups.com
Hi Kevin

My understanding is that 'borrowed words' should have their own lexical entry.
What is needed are notes of the source language, the word in the
source language and the meaning in the source language.

Sorting out borrowed words is always possible later with filters and
bulk edit in FLEx. This assumes that the words are marked properly (in
a particular note/etymology field, or a custom field).

There have been earlier discussions on how to handle borrowed words:

-------------------------------
Mayank Jain, [FLEx] Field for borrowing, 11th Feb 2016
<jnu.m...@gmail.com>


---------------------------
Ronald Moe, <Ron...@sil.org> Fri, Oct 19, 2012

The etymology of a word is merely its history, as best as we can
determine it, given the historical evidence we have. So etymology
properly includes both inherited and borrowed words.

....
You can do one of two things <for borrowed words>. You can use the
current Etymology fields or you can set up custom fields. The default
language for the Etymology field is the vernacular.


To adequately handle borrowed words, we need the following:


A way to introduce the etymology/borrowing (e.g. from/borrowed from/possibly)

A way to indicate the source language (e.g. Hindi/Italian idiom)

A way to indicate the form in the source language (e.g. pāejāma)

A way to indicate the gloss of the source form (e.g. trousers)

A way for the lexicographer to make a note to himself

A way to indicate the bibliographical source of the data

-----------------------------------------------------------------------------------------------------------------------


Regards
Hannes Hirzel

Ron Moe

May 6, 2016, 2:21:28 PM
to flex...@googlegroups.com
Hi Kevin,
I've been writing on the issue of etymology for many years. I must confess that it is a bit frustrating to me that the etymology area has not been improved. But the programmers have many demands on them. I greatly appreciate all they have done. They truly fit the description, "Over-worked, under-paid, under-appreciated."

You quote me as saying, "However it would be wiser to set up a custom field for them in order to keep inherited words and borrowed words separate in the database." I'm not sure why I said this. The history of a word can involve both borrowing and then later development along with other inherited words. English borrowed words from Old Norse during the Viking era. Those words later changed along with our inherited Germanic words. The term "inherited" means that the word was never borrowed but was passed from generation to generation clear back to Proto-Indo-European. There are sometimes different issues in documenting a borrowed word as opposed to an inherited word. That may be why I said it would be wiser to keep them separate. On the other hand, since a word can be borrowed and then go through changes along with inherited words, it is necessary to be able to document both types of history for the same word. So I would say that the two systems need to be integrated in FLEx. One reason to keep them separate is so that you can sort/filter/search on one or the other. But FLEx is powerful enough that I don't think there is a good reason to keep them separate. So I'll recant on the quote above.

Having said that, I understand that the distinction between inherited and borrowed words can be politically charged in some languages. When a group of people are trying to preserve their language against strong pressure from a dominant language, that issue can be very emotional. However in lexicography we usually are trying to document how a language is used, rather than trying to legislate how it should be used. There have been cases in history when a language (e.g. French, Hebrew) set up a committee to legislate language use. Those committees can be fairly effective when there is popular support. But most of language use is not conscious and not subject to enforced decree. In your case you will have to educate your translators, reviewers, and readers about language use, communication adequacy, and other issues. If a particular word doesn't communicate to the audience (or part of the audience) then you will have to make painful decisions.

You should only create a lexical relation "dialect synonym" for words such as "elevator/lift" that are 1) synonyms, 2) for only one sense out of many, 3) the synonyms are used by different dialects. This is not something you would do for borrowed versus inherited words.

So, I'm assuming that documenting a word's history is important to these discussions. I'll give you some recommendations, but keep in mind that I can only make recommendations on the basis of limited understanding of your situation.

1. Every word being used in the language should be entered into FLEx and given its own entry. If you don't create an entry for a word, you can't describe it. If you try to describe it in a note field in some other entry, it will get lost. FLEx won't let you create a cross reference to a non-existent entry. So there are good reasons for creating an entry for every word.
2. FLEx has a way to exclude an entry from publication. You will find the relevant fields in Lexicon-Entry-Publication Settings (at the bottom of the screen). There are various reasons to exclude a word from a published dictionary, including obscene, rare, uncertain meaning, archaic, etc.
3. If a word is borrowed, put the source language name in the Etymology-Comment field. Then use Tools-Configure-Dictionary to make sure that the field is printed. You can format the fields by ordering them and adding punctuation and spacing. (The Source field is for bibliographical information, not the source language. But you could use it and put bibliographical info in the Comment field.)
4. If you have a borrowed word and an inherited word for the same concept, enter both into FLEx. Then link the two as synonyms. If they are related word to word, then link them using the Cross References field on the entry level. If they are related sense to sense (only one sense is shared), then link them using the Lexical Relations field on the sense level.
5. Put observations about the sociolinguistic usage of a word in the Sociolinguistics Note field on the sense level. If you want an entry level field, you can use the Note field on the entry level.
6. If you need a custom field, use Tools-Configure-Custom Fields. You should create the custom field on the entry level if it relates to the whole entry. Only create a sense level custom field if it only relates to a single sense. Only create a custom field when none of the standard fields fits. FLEx knows what to do with standard fields (if you use them for the intended purpose). It doesn't know what to do with custom fields.
7. There is no automatic way to indicate who created an entry or modified it. The Date Created and Date Modified fields are automatically filled in and may give you a clue. My guess is that it would be very difficult to remember to always fill in such information. If you think it is important enough, you could use the Note field on the entry level, the Source field on the sense level, or create a custom field.

If we haven't answered all your questions, ask again.
Ron Moe


Kevin Edgerton

May 12, 2016, 12:07:39 PM
to FLEx list
Thank you for your helpful reply, Hannes!
Kevin

Beth-docs Bryson

May 24, 2016, 8:33:32 PM
to flex...@googlegroups.com
Hannes, thank you for finding this important quote from the Etymology discussion in October 2012.

As part of our current work on the Dictionary area of FLEx, we will be adding some fields and adjusting others.  You'll be happy to know that we are finally adjusting the Etymology cluster.

What we are planning to do is much like what is outlined below.  That is, the Etymology cluster will now consist of these fields:

Pre-comment: for an annotation indicating the beginning of an Etymology  ( "from", "<", "borrowed from", etc.)  NEW FIELD
Source Language: what language it came from  (see note below)
Source Form: the word it came from.  Can record in any analysis or vernacular Writing System (WS). (EXISTING FIELD, formerly called Etymology)
Gloss: the meaning of the source word.  (In any analysis WS.)  (EXISTING FIELD)
Post-comment: To be published in the dictionary entry after everything above.  (Any analysis WS)  (EXISTING FIELD, formerly called Comment)
Note: Notes to the dictionary compiler; not to be published. (Any analysis WS)  (NEW FIELD)
Bibliographic Source: Bibliographic evidence for this proposed etymology.  (see note below)

And we are changing it so that there can be more than one Etymology cluster in the same entry.
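
For readers who think in data structures, the planned cluster can be pictured roughly as follows. This is only an illustrative sketch in Python, not FLEx's actual data model or API: the class and attribute names are made up to mirror the field list above, the headword and the note text are hypothetical, and leaving Bibliographic Source out of the published text is an assumption (the post only says that Note is not published).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class EtymologyCluster:
    """One etymology cluster; attribute names mirror the planned field list."""
    pre_comment: str = ""           # e.g. "from", "<", "borrowed from"
    source_language: str = ""       # language the word came from
    source_form: str = ""           # formerly called Etymology
    gloss: str = ""                 # meaning of the source word
    post_comment: str = ""          # formerly called Comment; published
    note: str = ""                  # for the compiler; not published
    bibliographic_source: str = ""  # evidence for the proposed etymology

    def published_text(self) -> str:
        # Join the published fields in the order given in the list above.
        # Note is skipped; skipping bibliographic_source is an assumption.
        parts = [self.pre_comment, self.source_language, self.source_form]
        if self.gloss:
            parts.append(f"'{self.gloss}'")
        parts.append(self.post_comment)
        return " ".join(p for p in parts if p)


@dataclass
class Entry:
    """A lexical entry that may carry more than one etymology cluster."""
    headword: str
    etymologies: List[EtymologyCluster] = field(default_factory=list)


# Hypothetical entry using the Hindi example quoted earlier in the thread.
entry = Entry("pyjamas")
entry.etymologies.append(EtymologyCluster(
    pre_comment="borrowed from",
    source_language="Hindi",
    source_form="pāejāma",
    gloss="trousers",
    note="check spelling with reviewer",  # hypothetical compiler note
))
print(entry.etymologies[0].published_text())
```

Because `etymologies` is a list, a word that was borrowed and then developed further could carry one cluster per stage of its history.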


In planning how to do this, there were a number of questions.

1. It is apparent that some people have used the existing Source field to record the source language and others have used it for true bibliographic source references.  In looking at around 100 databases, it seemed roughly 75% used it for language names, 25% for bibliographic info, and a handful used it for both (different records in the same project).  We have decided that we will migrate the existing Source field into "Source Language" rather than into "Bibliographic Source".  This will mean that some number of users will have to adjust their data, but our hope is that with Bulk Edit this will not be hard to do, and that it will be immediately obvious that they need to do so when they are looking at the data.

2. It is not clear whether existing users have used "Comment" for comments to be published in the dictionary entry, or for notes to themselves.  At the moment we are planning to migrate this field to "Post-comment", which is intended to be published.  Again, hopefully it will be easy with Bulk Edit to change if that is not how you have done it.

3. I am open to suggestions of other possible names for the fields I have called "Pre-comment" and "Post-comment" above.
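
The field mapping implied by points 1 and 2 can be sketched as below. This is a hedged illustration in Python, not the FieldWorks migration code: records are modeled as plain dictionaries, the function names `migrate_etymology` and `move_field` are invented for this sketch, and the sample values are hypothetical. The second function stands in for the kind of Bulk Edit correction a user would make if their old Source field held bibliographic information.

```python
# Old field name -> new field name, as described in the post above.
FIELD_MIGRATION = {
    "Etymology": "Source Form",
    "Gloss": "Gloss",
    "Comment": "Post-comment",
    "Source": "Source Language",
}

NEW_FIELDS = [
    "Pre-comment", "Source Language", "Source Form", "Gloss",
    "Post-comment", "Note", "Bibliographic Source",
]


def migrate_etymology(old: dict) -> dict:
    """Return a new-style record; the new fields start out empty."""
    new = {name: "" for name in NEW_FIELDS}
    for old_name, value in old.items():
        new[FIELD_MIGRATION[old_name]] = value
    return new


def move_field(record: dict, from_field: str, to_field: str) -> dict:
    """Mimic a bulk-edit fix: move a value from one field to another."""
    fixed = dict(record)
    fixed[to_field] = fixed[from_field]
    fixed[from_field] = ""
    return fixed


# Hypothetical old record whose Source field held a bibliographic reference.
old_record = {
    "Etymology": "pāejāma",
    "Gloss": "trousers",
    "Comment": "",
    "Source": "field notes, p. 12",
}
migrated = migrate_etymology(old_record)
corrected = move_field(migrated, "Source Language", "Bibliographic Source")
print(corrected["Bibliographic Source"])
```

After the migration alone, the reference lands in Source Language, which is the visible mismatch the post expects users to notice and correct.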

We hope this will meet the needs better than what we have had so far.

Beth Bryson
for the FieldWorks Dev Team




Kevin Edgerton

May 31, 2016, 2:43:27 PM
to FLEx list
Hi Ron,
Thank you so much for your detailed reply.
1. One point I'm a little confused on sounds like a contradiction.  You said, "You should only create a lexical relation "dialect synonym" for words such as "elevator/lift" that are 1) synonyms, 2) for only one sense out of many, 3) the synonyms are used by different dialects. This is not something you would do for borrowed versus inherited words."  But in #4 you say, "If you have a borrowed word and an inherited word for the same concept...link the two as synonyms...using the Lexical Relations field...."
2. Since it isn't my goal to delve into the question of whether two words are related at the word or only sense level (and I may not know), would it be reasonable to generally just link them at the word level, and then if it becomes clear later that the relation is only sense-to-sense, change it then?
Kevin

Kevin Edgerton

Jun 1, 2016, 2:47:21 PM
to FLEx list
Ron (and others),

I'm still thinking through the question of where to put the comments of multiple MT speakers on any given lexical entry.  Since potentially each person hails from a different region/dialect and has a different opinion on any given word, it seems important to be able to note both the comment and its source.  Besides the source of the comment, I'd like to keep track of the source of the word, as in how it came into the database (e.g. a particular set of texts).

An example of one entry (HEL-1 is a code name):

General Note:  HEL-1:  not used in P2
Source:  DEL (set of texts)
Usages:  non-standard (based on HEL-1's comment)

It gets more complicated if others also make a comment on the same word, esp. if their comment is different.  I suppose one option is to just start a list under General Note with others' code names and comments on the same line...but Usages only allows one choice, and there is no way to indicate the basis for that categorization (i.e. WHO says that).
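
If the "code name: comment" convention is kept to one reviewer comment per line in a note field, the notes stay machine-readable once exported. The following is a minimal Python sketch of that idea, assuming the note text has been exported as plain text. The function name is invented for illustration, and the second reviewer code (REV-2) and its comment are hypothetical; only the HEL-1 line comes from the example above.

```python
from collections import defaultdict


def parse_reviewer_notes(note_text: str) -> dict:
    """Group 'CODE: comment' lines by reviewer code name."""
    comments = defaultdict(list)
    for line in note_text.splitlines():
        code, sep, comment = line.partition(":")
        # Skip lines that do not follow the 'CODE: comment' convention.
        if sep and code.strip():
            comments[code.strip()].append(comment.strip())
    return dict(comments)


# One comment per line; REV-2 is a made-up second reviewer.
general_note = "HEL-1:  not used in P2\nREV-2: common in the capital"
by_reviewer = parse_reviewer_notes(general_note)
print(by_reviewer)
```

Keeping the code name at the start of each line records who said what without needing a separate field per reviewer.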

Are there other suggestions on how to use the existing fields, or whether a new field is recommended?

Is keeping track of who says what about an entry even something other FLEx users are doing or see as important?

I'd appreciate a bit of guidance on this.

Kevin

Beth-docs Bryson

Jun 2, 2016, 4:03:44 PM
to flex...@googlegroups.com
All of that sounds reasonable.

Note that the Usages field in FLEx is a list field, meaning that it can only be filled in with values that exist in the "Usages" list.  (This often creates confusion during import from Toolbox.)  This is for folks who want a controlled set of values for that field.  This is consistent with how it was done in LinguaLinks.  This would not be ideal for the example you gave, where you included info about why you called it non-standard.

Many users also create a custom field called "Usage Notes" where they can write freeform notes about the usage of an entry.  This is consistent with how it is done in MDF, and this sounds like it is more what you are looking for.

In a future version of FieldWorks, we will have both kinds of Usage fields.

-Beth


Kevin Edgerton

Jun 2, 2016, 4:41:03 PM
to flex...@googlegroups.com

Thanks for that input. What is MDF?
Kevin


David Rowe

Jun 4, 2016, 5:33:24 PM
to flex...@googlegroups.com
MDF = Multi-Dictionary Formatter (http://www-01.sil.org/computing/shoebox/MDF.html), a standard used with Toolbox (and its predecessor Shoebox) for entering dictionaries. 