Kwak'wala Orthography(ies)


Joel

Feb 12, 2010, 2:47:39 PM
to Online Linguistic Database
Hey Kwak'wala-studying people,

What's the deal with orthographies in your language? I seem to
remember you all saying that there were many floating around. Have
you reached a consensus on which orthography to use?

I think the Kwak'wala OLD should use a single orthography for
inputting data. Lacking such consistency, the system will be far less
usable. If need be, it wouldn't be too difficult to write converters
from our "standard" orthography to the other(s). Any thoughts on
this?

A related point to think about: even within a single orthography there
can be subtle variations in character representation. E.g., in
Salish, ejectives can be represented with COMBINING COMMA ABOVE
(U+0313), e.g., k̓, or with COMBINING COMMA ABOVE RIGHT (U+0315), e.g.,
k̕. I imagine the same complication occurs in Kwak'wala.
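
A minimal sketch of how that kind of variation could be folded away on
input, assuming (purely for illustration) that U+0313 is picked as the
one true ejective mark. The function name is made up; only the standard
library is used.

```python
import unicodedata

COMMA_ABOVE = "\u0313"        # COMBINING COMMA ABOVE
COMMA_ABOVE_RIGHT = "\u0315"  # COMBINING COMMA ABOVE RIGHT


def normalize_ejectives(text):
    """Fold U+0315 into U+0313 (an assumed choice) and drop doubled marks."""
    # Decompose first so precomposed and decomposed input look the same.
    text = unicodedata.normalize("NFD", text)
    text = text.replace(COMMA_ABOVE_RIGHT, COMMA_ABOVE)
    # Collapse accidental repeats of the mark on the same base letter.
    while COMMA_ABOVE * 2 in text:
        text = text.replace(COMMA_ABOVE * 2, COMMA_ABOVE)
    return unicodedata.normalize("NFC", text)
```

Running every string through something like this before it is stored
would mean a search for one form of k-with-comma also finds the other.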

So, what methods are people using to write Kwak'wala? I know the
"language geek" (www.languagegeek.com) has created keyboard layouts
for Mac and Windows. Are those what people are using? How about for
Linux - are there any Linux users in the group? And if so, have they
taken the time to create their own keyboards?

Any thoughts on this will be very helpful. Thanks,

Joel

Patrick Littell

Feb 12, 2010, 3:23:08 PM
to Joel, Online Linguistic Database
Hey, Joel.

There are about four orthographies in use right now, in two broad camps -- there's the dictionary ("Grubb") orthography and U'mista orthography, which are close variants of one another, and the linguists' ("NAPA") orthography and that used by School District 72, also close variants.

Luckily all four are notational variants of each other -- they all agree on the basic phonemicization -- which means it's computationally easy to convert from one to another.  (I've implemented it twice, in Python and PHP, and in neither case did it take more than a few hours, most of that spent trying to remember the subtle annoyances of the different orthographies, like none of them agreeing on where apostrophes go.)

You're totally right about a standard input orthography -- parts of the system (like search) will be icky if underlyingly the data isn't in a consistent format. 

Ideally it should be all in ASCII, or at least Latin-1 or something like that, because of the encoding problem that you mention.  There are too many different ways to do ejectives, underlines, accents, etc. in Unicode, and what's worse is that NAPA and SD72 use a character (superscript z) for which there's no Unicode code point.  (In the First Nations Unicode font, they use the private use area -- it's at U+F131 if I remember correctly, but it's been a while.  The difficulty with this is that no other font on earth has a superscript z at U+F131, so visitors to the site would have to install that particular font.)
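
A rough sketch of how incoming data could be checked for font-specific
characters like that one. It flags anything in a Unicode private use
area rather than looking for one code point, since the exact location
above is from memory. The function name is hypothetical.

```python
import unicodedata


def find_private_use(text):
    """Return (index, code point) pairs for private-use characters."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        # General category "Co" marks private-use code points.
        if unicodedata.category(ch) == "Co"
    ]
```

For example, find_private_use("a\uf131b") reports a single hit at
index 1, which an input form could reject or remap before saving.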

There are some other issues of interest to programmers that we can talk about sometime, but might not be of interest to the whole group.  In any case, probably an unambiguous ASCII or Latin-1 orthography is our safest bet for input, and then some trivial code can output the results into any of the four orthographies.
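
To make "trivial code" a bit more concrete, here is a minimal sketch of
a table-driven converter from an ASCII input form to a display form. It
is not the Python or PHP code mentioned above, and the mapping table is
an invented placeholder, not the real correspondences for any of the
four orthographies. The only real idea is matching longer sequences
before shorter ones.

```python
import re


def make_converter(table):
    """Build a function that rewrites text using `table`, longest key first."""
    # Sort keys by length so "k'w" is tried before "k'".
    keys = sorted(table, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    return lambda text: pattern.sub(lambda m: table[m.group(0)], text)


# Placeholder table: made-up ASCII sequences, NOT an actual orthography.
ASCII_TO_DISPLAY = {
    "k'": "k\u0313",         # k + COMBINING COMMA ABOVE
    "k'w": "k\u0313\u02b7",  # same, plus MODIFIER LETTER SMALL W
    "x_": "x\u0331",         # x + COMBINING MACRON BELOW
}

to_display = make_converter(ASCII_TO_DISPLAY)
```

With one table per output orthography, the stored ASCII form stays the
single source of truth and each display form is generated on the way
out.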

-- Pat

--
Patrick Littell
UBC Department of Linguistics
Totem Field Studios
University of British Columbia
Vancouver, British Columbia V6T 1Z4