apostrophe as a letter

Catherine Crawford

Jul 8, 2010, 12:25:12 PM
to flex...@googlegroups.com
Dear All,

FLEx does a small but annoying thing by failing to recognise an
apostrophe as a letter in Vernacular (Fulfulde) script. By way of example:

the word na'i (/cows/) gets split in a text chart into na and i as if it
were two separate words.

I have not been able to see how to rectify this yet, but I feel sure it
must have to do with the apostrophe needing to be in a list of
admissible letters for the vernacular. Can anyone help?

Many thanks,
Catherine Crawford

Richard Gravina

Jul 8, 2010, 1:27:58 PM
to flex...@googlegroups.com
Hi Catherine,

If you're using a Cameroon Unicode keyboard, you can use ;g to get a
character that looks like an apostrophe but is treated like a letter, e.g.
na'i.

Richard Gravina

--------------------------------------------------
From: "Catherine Crawford" <catherine...@sil.org>
Sent: Thursday, July 08, 2010 5:25 PM
To: <flex...@googlegroups.com>
Subject: [FLEx] apostrophe as a letter

> --
> You received this message because you are subscribed to the discussion
> group "FLEx list". This group is hosted by Google Groups and is open for
> anyone to browse.
> To post to this group, send email to flex...@googlegroups.com
> To unsubscribe from this group, send email to
> flex-list-...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/flex-list

Kevin Warfel

Jul 8, 2010, 4:26:37 PM
to flex...@googlegroups.com
Catherine,

I don't have a lot of experience in this area, but I know I had to get FLEx to recognize the hyphen (-) as a word-forming character in the language I am working on, and this is how I think I was told to get that to happen. (FLEx does handle my hyphens correctly for me now, so I know it can be done.)

First things first: copy your 'apostrophe' onto your "clipboard" in case you need it later in the process I outline below. (To do this, go to your Lexicon, find a word that has an apostrophe in it, highlight just the apostrophe by clicking and dragging, then use Ctrl-c to copy it onto your "clipboard".) Once you've done that, ...

Go to File > Project Management > FieldWorks Project Properties, then click on the "Writing Systems" tab. You should see a dialog box that includes two panes - one for Vernacular Writing Systems and the other for Analysis Writing Systems. Highlight the language in the Vernacular WS pane that corresponds to Fulfulde (probably called 'Fulfulde', but it might possibly have been set up with a different name) if it's not already highlighted in dark blue. Then click on the "Modify" button to the right. Next, click on the "Characters" tab, then on the "Valid Characters" button. That gets you to the screen (technically, a dialog box) referred to on the Help file page I've written about below (at the end of this message).

What I think is the key thing for you is to get the apostrophe that you use in Fulfulde words to appear in the topmost pane (Word-forming characters). (Be aware that there are a number of different Unicode characters that are similar and all look like apostrophes.) I suspect that it is either not in any of the panes you are looking at as you follow these steps, or it is in the middle pane (Punctuation, Symbols, and Spaces).

If you see the apostrophe in the middle pane, you should be able to right-click on it there and receive an option to make it a word-forming character, at which point it should move from the middle pane to the topmost pane, and your problem is solved (I think).

If you don't see the apostrophe in either of the top two panes, you will need to add it manually to the list of word-forming characters. To do this, click on the "Manual Entry" tab (to the left side of the dialog box). Click on the "Single Character" radio button if it hasn't already got a black dot in it. Click in the white space just to the right of the text that reads "Enter a single base character plus any ..." and use Ctrl-v to paste the apostrophe there that you copied from the word in your lexicon. Then click the "Add" button. That will put that character in one of the panes on the right. If it puts it in the topmost pane, you're done. If it puts it in one of the other panes, try what I suggested in the previous paragraph and see if you can get it to transfer to the top one.

For more information, look up "word-forming" in the Help files that come with FLEx, then choose the page that is entitled "Treat punctuation as word-forming characters". It seems to me that what is discussed there is just what you're describing.

Please let me know if this was helpful or not.

Blessings,
Kevin Warfel

Beth

Jul 9, 2010, 1:38:13 AM
to flex...@googlegroups.com
Yes, what Kevin describes below should work.

- I don't know how long ago you created the project. In projects
that were created more recently, the apostrophe is in the word-
forming characters section by default. But if it was created more
than a year or so ago, it wouldn't be. (Saying this for the benefit
of others who may wonder.)

- Do think about whether you really do want to use apostrophe in
your orthography, or some other character that Unicode does recognize
as word-forming. Particularly think about whether you ever need to
use apostrophe for punctuation--that would be a key indicator that
you want something else for the alphabetic letter. As Kevin said,
there are a number of others, including "modifier letter apostrophe"
and "saltillo" (which has both a lower and upper case version). The
key is to try to encourage a standard across everyone using the
orthography if possible (rather difficult with a language as widely
spoken as Fulfulde!!). It would be nice if every time people saw
that symbol, there were the same Unicode codepoint underneath.
However, that may indeed be a hopeless cause, unless a Fulfulde
language committee chose to make it a priority.
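
One way to see which of these look-alikes Unicode counts as letters is to ask for each character's general category. The sketch below uses Python's standard unicodedata module. The candidate list is an illustrative selection of my own, not something taken from FLEx, and the helper names are made up for the example.

```python
import unicodedata

# Characters that can all look like an apostrophe on screen.
# Illustrative selection only, not an exhaustive list.
CANDIDATES = ["\u0027", "\u2019", "\u02BC", "\uA78B", "\uA78C"]

def category(ch):
    """Return the Unicode general category, e.g. 'Po' (punctuation) or 'Ll' (letter)."""
    return unicodedata.category(ch)

def is_letter(ch):
    """True if Unicode places the character in one of the letter categories (L*)."""
    return category(ch).startswith("L")

for ch in CANDIDATES:
    label = "letter" if is_letter(ch) else "not a letter"
    print(f"U+{ord(ch):04X}", category(ch), label)
```

In this sketch only the modifier letter apostrophe (U+02BC) and the two saltillo characters (U+A78B, U+A78C) come out as letters. The ASCII apostrophe (U+0027) and the curly right single quotation mark (U+2019) are both classed as punctuation.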

-Beth

Andreas_Joswig

Jul 9, 2010, 1:39:06 AM
to flex...@googlegroups.com
It is a sad fact that apostrophe is not a letter, but a punctuation
mark, and when you give a computer a punctuation mark, it will treat it
accordingly. So, if you want a letter (say, for a glottal stop), you
will have to give it a letter. Fortunately, as Richard Gravina points
out, Unicode provides letters that look like an apostrophe, and only
those should be used to represent letters. This is an important question
to consider for orthography design - using an apostrophe can get you into
just as many technical difficulties as using a special character like
<ʔ>. Using the ASCII apostrophe on your US keyboard is not a suitable
solution for writing a glottal stop or an ejective mark.
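
The effect is easy to reproduce outside FLEx. The following is a minimal sketch in Python, assuming a tokenizer that simply collects runs of Unicode word characters. That is how Python's `\w` behaves, and it is only an analogy for what FLEx does internally. The function name `words` is made up for the example.

```python
import re

def words(text):
    """Split text into runs of Unicode word characters (letters, digits, underscore)."""
    return re.findall(r"\w+", text)

print(words("na\u0027i"))  # typed with the ASCII apostrophe U+0027
print(words("na\u02BCi"))  # typed with the modifier letter apostrophe U+02BC
print(words("na\uA78Ci"))  # typed with the small saltillo U+A78C
```

With the ASCII apostrophe the word falls apart into "na" and "i". With either of the letter look-alikes it stays in one piece.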

Andreas Joswig

Paul Unger

Jul 9, 2010, 2:16:46 AM
to FLEx list
I'm not convinced one way or the other about how to represent a
glottal stop, but Lichtenberk, in his grammar of Toqabaqita (2008)
writes, "I have decided to use the letter <q> to represent the glottal
stop. The apostrophe is not a letter and does not distinguish between
the lower and the upper cases" (40). In an endnote he adds, "As
Christine Foris put it very aptly some years ago... : 'The apostrophe
is an insult to the consonants.'" I'm not sure if it will catch on
here in the Solomons (where a number of languages do have a glottal
stop), since a number of languages also use <q> for prenasalised /g/
(which is also quite prevalent), but it is an interesting approach.
For what it's worth,

Paul

Eric & Susanne Johnson

Jul 9, 2010, 5:10:09 AM
to flex...@googlegroups.com
A number of government-approved orthographies in our corner of the world
use apostrophes to disambiguate syllable boundaries (which is important
to disambiguate syllable codas or syllable final tone-marking
consonants from onsets). Back when we created our FLEx db and created
our language settings, FLEx was not yet considering the apostrophe as a
word-forming character. At that time I requested it be added to the
word forming characters by default because it is more than just used for
punctuation in many languages, even those in which it is not used to
represent a phoneme. I'm glad to hear it is now added.

Even in languages like English and French, the apostrophe is not just a
punctuation mark. When it is used to represent missing letters in
optional contracted forms in English (don't, they're, we'll), or
phonologically mandatory "contractions" in French (l'amour, c'est),
FLEx treating this as punctuation rather than a word-forming character
is not a big problem, as these are all composed of two morphemes, which
can be divided up and analyzed, though these are of course recognized as
single words in spell checkers, dictionaries, etc. But English and
French also both have archaic contractions which require the use of the
apostrophe ("o'clock," "ain't," "aujourd'hui") as a word-forming element
and which cannot be easily broken down into synchronic morphemes. And
of course many loanwords and names routinely are spelled with an
apostrophe in English and French (e.g. coup d'état, al Q'aida, N'djamena,
Xi'an). When I type apostrophes in these words and Word or Thunderbird
correctly treats the word as a single word, rather than as a punctuation
error (missing a space after or before the apostrophe, etc.), is that
because these programs are swapping out my keyboard's punctuation
apostrophe for one of the pseudo-apostrophes that Andreas mentioned, or
are these programs sophisticated enough to understand that, while the
apostrophe does not represent a phoneme in these languages, for most
type-setting purposes it is, in fact, a letter as well as a punctuation
symbol?

Having used an apostrophe as a word-forming element in FLEx and other
software for our research language for a number of years now, it seems
that if one can live without having capital and small forms for this
particular phoneme, using an apostrophe as a word-forming character to
represent a phoneme (e.g. glottal stop) is not a big problem. We do
avoid using it for internal quotes and use the curly apostrophes
instead. For some applications that cannot automatically convert the
straight apostrophe to the curly apostrophes, we use greater and lesser
than symbols < > and then find and replace. If we add the straight
quotations to the punctuation rules for the writing system, (e.g. I told
her, "He said 'Hi.'") then FLEx will consider the beginning and ending
quote apostrophes as part of the word itself and parse it with the word
as we've told FLEx to treat the apostrophe as word-forming and there
appears no way to tell it to treat the apostrophe as word-forming only
when found word-medially. But as we are not at liberty to reinvent the
official orthography, we must continue to use apostrophes as
word-forming and substitute other symbols for the punctuation, or else
bring in an "apostrophe-like" character and be forced to use a keyboard
converter like Keyman for a writing system that uses only the 26 Latin
letters plus the apostrophe just to get that one look alike character.
So we've chosen to keep the default apostrophe and use substitutes for
the punctuation instead.
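
The rule wished for here, an apostrophe that is word-forming only when word-medial, can be sketched as a tokenizing pattern. This is a hypothetical illustration in Python, not a FLEx feature or setting. The names `TOKEN` and `tokenize` are made up for the example.

```python
import re

# A token is a run of word characters, optionally continued by
# (apostrophe + more word characters). An apostrophe at the edge of a
# word has no word character on one side, so it is left out of the token.
TOKEN = re.compile(r"\w+(?:'\w+)*")

def tokenize(text):
    """Return the words in text, keeping only word-medial apostrophes."""
    return TOKEN.findall(text)

print(tokenize("I told her, \"He said 'Hi.'\""))
print(tokenize("'ndi'ndang' don't"))
```

Under this rule the quoting apostrophes around 'Hi.' are dropped as punctuation, while the apostrophe inside ndi'ndang stays part of the word.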

However, I wonder whether in the long run the valid characters section
of the writing systems setup will need something more sophisticated than
at present, in which it appears that a given symbol cannot be considered
both a word-forming character in certain environments (e.g. word-medial)
and punctuation in other environments (e.g. word-initial and
word-final). Another character besides the apostrophe that should be
treated as both word-forming and punctuation is the hyphen. Quite a few
words in my English and French dictionaries have a hyphen as a
word-forming character (cross-country, n'est-ce, Port-au-Prince). Though I
can add the hyphen as a word-forming character, then I lose it as a
punctuation mark. Also, I'm not sure whether it's possible to tell FLEx
that though a given symbol should be considered "word-forming" because
it does not represent a phoneme it should not be considered in any way
in alphabetization. For example, my Oxford English Dictionary treats
word-forming apostrophes and hyphens as not affecting alphabetization
(so "o'clock" falls between "ocker" and "octad," and "crossfire" falls
between "cross-fertilization" and "cross-grain"). But though I've not
added the apostrophe to my custom sort order, it is treated as a
sortable symbol and therefore I get:

ndi
ndi'ndang
ndiag
ndin
ndip

Rather than:

ndi
ndiag
ndin
ndi'ndang
ndip

This is not actually a problem for me as I find it easier for my own use
when words are sorted by like syllables. However, for end-users of a
dictionary someday, it could be a bit confusing if they've been taught
to look things up in traditional alphabetical order, and/or are not sure
whether an apostrophe was needed in the sought after word. (The
apostrophe in "ndi'ndang" /ʔdi¹³ʔdaːŋ¹³/ is to disambiguate for the
phonologically possible "ndin'dang" /ʔdin¹³taːŋ¹³/)
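
The two orderings above can be reproduced with a short sketch. It is plain Python using code-point comparison, not FLEx's own collation. The ignored-character set and the name `sort_key` are assumptions made for the illustration.

```python
# Characters that are skipped when comparing words (an assumption for this sketch).
IGNORED = "'-"

def sort_key(word):
    """Compare on the word with ignored characters removed; the raw word breaks ties."""
    stripped = "".join(ch for ch in word if ch not in IGNORED)
    return (stripped, word)

entries = ["ndip", "ndi'ndang", "ndin", "ndiag", "ndi"]

# Plain sort: the apostrophe counts, and it sorts before every letter.
print(sorted(entries))

# Dictionary-style sort: the apostrophe is ignored.
print(sorted(entries, key=sort_key))
```

The first call gives the order with ndi'ndang directly after ndi. The second gives the order with ndi'ndang between ndin and ndip.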

So it seems like we need more ways to characterize symbols beyond just
"word-forming" and "punctuation," to allow word-forming characters to
also be used as punctuation and excluded from sorting.

Eric

Catherine Crawford

Jul 9, 2010, 5:42:36 AM
to flex...@googlegroups.com
Thanks for all the comments. It is always interesting to see what a
seemingly 'simple' problem throws up! I appreciate the difficulty of
having what is essentially a punctuation mark as a letter. However, the
apostrophe is a recognised letter in the official Fulfulde alphabet here
in Mali, and is well established as such. The apostrophe is not used
for anything else, to my knowledge.

Kevin, your instructions were great and I see the route very clearly.
However, when I look in the Word-forming characters pane there are just
the following: a couple of symbols (circles with exclamation marks in
them), some combinations of capital letters, a hyphen, numbers from 0 to
9 and something that looks very like a straight apostrophe. The latter
suggests that FW should already recognise an apostrophe as a letter.

Is the choice of keyboard significant? I use Clavier du Mali (Keyman
Mali keyboard), linked to Maltese as the system language keyboard.

Catherine

Catherine Crawford

Jul 9, 2010, 6:10:37 AM
to flex...@googlegroups.com
We are fortunate that Mali uses French-style punctuation. Therefore
English-style single inverted commas do not feature. For direct speech
the line is introduced by a long line like an elongated hyphen or else
encompassed by double less than/more than symbols (which my keyboard
seems unable to produce consistently, to my frustration) as for
quotations. The apostrophe also only appears word-medially, since all
vowel-initial words have glottal onset and this is not marked
orthographically.

I have not so far observed what happens to our alphabetical order, but
will do now.

Catherine

Kevin Warfel

Jul 9, 2010, 8:51:30 AM
to flex...@googlegroups.com
Catherine,

That actually sounds a lot like what mine looked like, but I thought mine was perhaps unusual, so didn't mention it. Try this approach, which worked well for me yesterday when I was doing my research in order to respond to you:

Follow my instructions as before: Go to File > Project Management > FieldWorks Project Properties, then click on the "Writing Systems" tab. You should see a dialog box that includes two panes - one for Vernacular Writing Systems and the other for Analysis Writing Systems. Highlight the language in the Vernacular WS pane that corresponds to Fulfulde (probably called 'Fulfulde', but it might possibly have been set up with a different name) if it's not already highlighted in dark blue. Then click on the "Modify" button to the right. Next, click on the "Characters" tab, then on the "Valid Characters" button.

Now, instead of clicking on the "Manual Entry" (at left), click on "From Data". Then click on the "Scan" button. This will scan all of your Fulfulde text material and do an inventory of the characters it finds - letters, numbers, punctuation, and all the different combinations of letters and diacritics (but I'm not sure you have much of that sort of thing in Fulfulde - I worked in Burkina Faso, so am superficially acquainted with the language). Go down through the list and uncheck any that are not valid characters. (I found a few interesting typos of vowels with double tone marks when I did this yesterday, so excluded those and went and cleaned them up in my texts at the same time!) Then click on the "Add" button. That will put in your ɓ and ƴ without you even having to use your Mali keyboard (which doesn't work in this dialog box anyway). You can then try to move the apostrophe to the word-forming group, if necessary, but you may run into the problem that has been alluded to, namely that the apostrophe that is obtained directly from the keyboard is defined as non-word-forming in Unicode, and you may therefore be unable to move it to the word-forming group. It is also possible that Dan Brubaker, who designed the Keyman keyboard you are using, if I'm not mistaken, created it to insert one of the apostrophe look-alikes that have been alluded to, rather than the simple apostrophe that is defined as a "non-letter"; if that is the case, Dan is to be commended for his foresight.
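
For anyone who wants to check a text file outside FLEx, a rough equivalent of such a character inventory can be sketched in Python. This is only an approximation under my own assumptions: it counts single code points, whereas the FLEx scan also reports combinations of base letters and diacritics. The function name `inventory` is made up.

```python
from collections import Counter
import unicodedata

def inventory(text):
    """List (code point, Unicode name, count) for every non-space character in text."""
    counts = Counter(ch for ch in text if not ch.isspace())
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "UNNAMED"), n)
        for ch, n in sorted(counts.items())
    ]

for code, name, n in inventory("na'i na"):
    print(code, name, n)
```

An apostrophe in the data shows up with its exact code point and name, which makes it clear whether it is the ordinary one or a look-alike.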

In the event your apostrophes are the "ordinary" ones, here is my suggestion. Since there are other Unicode characters/codepoints that look virtually identical to the apostrophe generated by typing the apostrophe directly from your physical keyboard, talk to Dan about the possibility of altering the Keyman keyboard so that it can produce one of the word-forming apostrophes, then find a way to systematically replace the ordinary apostrophes in your database with the word-forming ones. Your data will *look* no different (so it will conform to the Fulfulde language standards), but it will behave differently (especially regarding your original problem). If you are working with sacred texts, you can use TE's Replace All function to make those changes in the texts. Someone else will have to tell you how to make such a wholesale change in the lexical database or in other texts. As always, however, it is wise to do a backup *before* using the Replace All function.
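
For plain-text data exported from a project, a systematic replacement of this kind might look like the sketch below. It is a hypothetical illustration in Python and does not touch a FLEx database directly. The replacement character (the small saltillo, U+A78C) is an assumption, and which look-alike is appropriate depends on the convention agreed for the language. Only apostrophes with a word character on both sides are changed, so quoting apostrophes are left alone.

```python
import re

ASCII_APOSTROPHE = "\u0027"
LETTER_APOSTROPHE = "\uA78C"  # LATIN SMALL LETTER SALTILLO; this choice is an assumption

# An ASCII apostrophe with a word character immediately before and after it.
MEDIAL = re.compile(r"(?<=\w)" + re.escape(ASCII_APOSTROPHE) + r"(?=\w)")

def replace_medial_apostrophes(text, replacement=LETTER_APOSTROPHE):
    """Swap word-medial ASCII apostrophes for a letter look-alike."""
    return MEDIAL.sub(replacement, text)

print(replace_medial_apostrophes("na'i"))
print(replace_medial_apostrophes("'na'i'"))
```

As with Replace All, this is best run on a copy of the data after a backup.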

Now that you know more about the nuts and bolts of your FW project, you can speak or write more knowledgeably about it. I would suggest that you contact Dan and work out with him the exact steps you should take to get from where you are currently to where you want to be. Dan is familiar with all of the issues at work in Mali and can better advise you than any of us outsiders, though I am happy to have been able to point you in the right direction.

Blessings,
Kevin

Jeff and Peg Shrum

Jul 9, 2010, 10:26:49 AM
to flex...@googlegroups.com
Using one of the other 26 letters that is not needed by the phonology of a language is a fine approach rather than using a special character or punctuation character of some kind. In Bantu zone P, /h/ has come to be used for the glottal stop since it is not needed elsewhere, and some languages in the zone do have more of a fricative than a full glottal stop. Similarly, /c/ is often used for the voiceless alveolar affricate since /k/ is sufficient for the voiceless velar stop. Such a system is easier to read and manipulate on the computer (which is biased towards English orthography), but unfortunately there are many "legacy" orthographies that will be hard to change.

Jeff S.
Milange, Mozambique

Jeff and Peg Shrum

Jul 9, 2010, 10:26:49 AM
to flex...@googlegroups.com
Glad to hear that apostrophe is a word-forming character because the string
/ng'/ is a common trigraph in Bantu orthographies no matter how much I or
others would like it otherwise. It is also common to use the apostrophe to
mark syllabic nasals which are formed by a process of vowel elision. Marking
the elision helps new readers, but does create other problems, like so many
things in orthography design. I have not seen a Bantu language in southern
or eastern Africa that uses apostrophe as punctuation, so it is fine putting
it in the category of "word-forming" as the default in my opinion. Are
there languages other than European ones, that use it as punctuation? The
point of FLEx is to study the least studied languages, isn't it? Let's make
the defaults support this type of research.

Jeff S.
Milange, Mozambique

Beth

Jul 9, 2010, 4:32:30 PM
to flex...@googlegroups.com
On Jul 9, 2010, at 2:42 AM, Catherine Crawford wrote:

> Kevin, your instructions were great and I see the route very
> clearly. However, when I look in the Word-forming characters pane
> there are just the following: a couple of symbols (circles with
> exclamation marks in them), some combinations of capital letters, a
> hyphen, numbers from 0 to 9 and something that looks very like a
> straight apostrophe. The latter suggests that FW should already
> recognise an apostrophe as a letter.
>
> Is the choice of keyboard significant? I use Clavier du Mali
> (Keyman Mali keyboard), linked to Maltese as the system language
> keyboard.

When you are in that dialog, hover over the character you are
wondering about. If you wait long enough, a "tooltip" will appear,
showing you the Unicode value and name of the character you are
hovering over. If it is the normal ASCII apostrophe, it will say
U+0027 Apostrophe. If it is the saltillo, you will find that out.

If the apostrophe is not there, try adding it according to Kevin's
earlier recommendations. Note that there is also a place at the
bottom of the dialog to add things by Unicode value, rather than
typing in the character. There you could type in 0027, to be sure
you're getting the right one in.
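
The same lookup the tooltip performs can be done outside FLEx. Here is a minimal sketch in Python using the standard unicodedata module; the function names `describe` and `from_hex` are made up for the example. It is also a handy check before entering a hex value, since neighbouring values name quite different characters (0029, for instance, is a right parenthesis).

```python
import unicodedata

def describe(ch):
    """Return the code point and Unicode name of a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch, 'UNNAMED')}"

def from_hex(value):
    """Turn a hex string such as '0027' into the character it stands for."""
    return chr(int(value, 16))

for value in ["0027", "0029", "02BC", "A78C"]:
    print(describe(from_hex(value)))
```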

In his more recent message he suggests using a different Unicode
character that looks like apostrophe but isn't. However, I would
only recommend doing that if that is the convention for all of
Fulfulde. However, it sounds like most people who type Fulfulde use
the simple apostrophe, and it also sounds like it doesn't figure as a
punctuation character. That sounds just fine. In that case, I
highly recommend using the same character everyone else uses.
Sorting and searching gets very confusing when different people type
different things for the same character. It's in a language group's
interest to define a standard for which Unicode characters to use for
that language, and to encourage the use of that standard.

It may be that you need to speak with Dan Brubaker about helping you
get the left and right guillemet symbols (the double angle quotes)
out of your keyboard. It is much better to use the real Unicode
values for those (U+00AB and U+00BB for the double ones, U+2039 and
U+203A for the single ones) rather than just two greater-than or
less-than signs. It should be quite
possible to make the keyboard produce those.

-Beth


Andreas_Joswig

Jul 13, 2010, 2:02:12 AM
to flex...@googlegroups.com
We have to keep two things separate. One problem is the behavior of
FLEx, and admittedly this is the point of this thread. You can tweak
FLEx to accept an apostrophe as a word-forming character. But the other
thing is that, no matter how much we like or dislike the fact, the
apostrophe as provided by U+0027 in Unicode IS a punctuation
character. FLEx is not the only software which will cause you grief if
you want to use it as anything other than a punctuation mark. Microsoft
Word for example has the sickening habit of changing it to a different
shape. My German computer by default will place the character upside
down at the baseline of the word. An apostrophe by default has a
particular behavior, and you need to dig deep down into the bowels of
your computer to get rid of that behavior. You may think you can do that
easily with your machine, but imagine you pass on a printer-ready file
of a publication to a print-shop in your country which happens to have
its computer donated from Germany. You will be quite disappointed by the
result, and probably blame the stupid print-shop owner.
That's why I maintain that using the regular apostrophe as a
word-forming character in an orthography, possibly to avoid using other
characters which on first glance may appear to be more complex to
access, is generally a very bad idea, except in the most controlled
circumstances. To avoid trouble you will have to use the special
apostrophe-like characters provided by Unicode, and for that you
will have to employ a keyboard solution for the language.
Andreas