Questions on plug-ins and the "cin" format in Leopard
Eric Rasmussen  
From: Eric Rasmussen <keras...@mac.com>
Date: Thu, 8 Nov 2007 11:25:45 -0500
Local: Thurs, Nov 8 2007 11:25 am
Subject: Questions on plug-ins and the "cin" format in Leopard
Well, updating the Yale site for Leopard turned out to only take a few  
hours, since I no longer need to do little intros and guides to the  
input methods. But it's not yet uploaded, and won't be for a while,  
because of one remaining obstacle:

Plug-in input methods.

This has been changed somewhat in Leopard -- we've already seen what  
may be a fairly serious bug possibly related to wrongly assigning the  
".cin" extension to a file that should have the ".inputplugin"  
extension. My current thinking is that ".inputplugin" is for the Apple  
format that used to be converted into a ".dat" file by the Input  
Method Plug-in Converter. But I don't know anything about what  
constitutes a valid ".cin" format file -- how are they different from  
the Apple format? Do they really work in Leopard? Or are there  
limitations?

I know ".cin" originated with the Xcin project for X11. I also know  
OpenVanilla is designed to use the ".cin" format.

Also, I'm moving OpenVanilla to its own section on the input methods  
page. It doesn't quite fit in with the other input methods, since it  
is a framework. I may put it in a new section called "Plug-in  
Frameworks", containing OV and the Apple plug-in mechanism. Does that  
make sense?

Anyhow, any advice, wisdom, and/or links to information about the  
".cin" format, OV, and the like would be most welcome!

Eric


Lukhnos D. Liu  
From: "Lukhnos D. Liu" <lukh...@gmail.com>
Date: Fri, 09 Nov 2007 09:25:00 -0800
Subject: Re: Questions on plug-ins and the "cin" format in Leopard
On Nov 9, 12:25 am, Eric Rasmussen <keras...@mac.com> wrote:

> This has been changed somewhat in Leopard -- we've already seen what  
> may be a fairly serious bug possibly related to wrongly assigning the  
> ".cin" extension to a file that should have the ".inputplugin"  
> extension. My current thinking is that ".inputplugin" is for the Apple  
> format that used to be converted into a ".dat" file by the Input  
> Method Plug-in Converter. But I don't know anything about what  
> constitutes a valid ".cin" format file -- how are they different from  
> the Apple format? Do they really work in Leopard? Or are there  
> limitations?

I'll begin with some history that I know.

.cin was first introduced by Xcin, an input method framework for X11
developed in the mid 1990s, as a data format for table-based input
methods. By table-based I mean input methods that can be implemented,
or seen, as a table look-up mechanism. Around 90% of input methods
(Chinese and beyond) can be implemented that way. Apple's .inputplugin
also belongs to that category. Almost every mainstream input method
framework supports at least one form of user-customizable IME
creation. .cin seems to have become one of the standard data formats
because it's simple and many user-generated tables are already in wide
circulation.

I have very limited knowledge of Xcin and other frameworks, but in the
early days, .cin was intended as a source format, not to be consumed
directly by the input method framework (or more precisely, by the
table-based input method "generator"). Back then a .cin could also use
any encoding recognized by the framework. So phone.cin (renamed to
bpmf.cin in OV) was encoded in Big5, pinyin.cin in GB, and so on.

When we were developing the "generic" module (first named OVIMXcin,
later renamed to OVIMGeneric) to support .cin in OpenVanilla, we made
two decisions. First, we no longer require users to run a
compiler/converter to turn a .cin into a binary format, as used to be
the case; the .cin is consumed by the input method module directly.
Second, all .cin files must use UTF-8 encoding. This opened the door
to a bigger character set and the famous "♨" input method.
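
A minimal sketch of what the UTF-8 requirement means for an older table:
it has to be re-encoded before a UTF-8-only consumer can read it. The
helper name `to_utf8` is hypothetical, and the codec names used with it
("big5", "gb2312") are assumptions about which Big5 or GB variant a
given legacy file actually uses.

```python
def to_utf8(data: bytes, source_encoding: str) -> bytes:
    """Re-encode the raw bytes of a legacy .cin table as UTF-8.

    source_encoding is a Python codec name such as "big5" or "gb2312".
    Raises UnicodeDecodeError if the bytes do not match that encoding.
    """
    text = data.decode(source_encoding)
    return text.encode("utf-8")


if __name__ == "__main__":
    # A one-line table entry as it might have been stored in Big5.
    legacy = "a 中\n".encode("big5")
    print(to_utf8(legacy, "big5").decode("utf-8"), end="")
```

Decoding strictly (the default) is a deliberate choice here: a wrong
guess about the source encoding fails loudly instead of producing a
silently corrupted table.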

So what constitutes a valid .cin file? For OpenVanilla, a .cin file
consists of three sections:

1. a header consisting of directives beginning with "%", like %ename,
%selkey, %endkey. Some of them are metadata; others are control
directives.

2. a keyname block between the directives "%keyname begin" and
"%keyname end". This tells the generic input method how to map each
key typed to the character displayed in the composing stage (mostly to
represent radicals in radical-based input methods).

3. a chardef block between the directives "%chardef begin" and
"%chardef end". This is the body of the data table. "chardef" is
something of an anachronistic misnomer. It used to define the mapping
from key sequences to characters (hence the name), but modern
implementations like OV and gcin allow phrases in this block.
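
The three sections above can be sketched as a small parser. This is an
illustration only, not OV's actual code: the function name `parse_cin`,
the sample table, and the assumption that each data line is a key
followed by whitespace and a value are all made up for the example.

```python
SAMPLE = """%ename Demo
%selkey 1234567890
%keyname begin
a 日
b 月
%keyname end
%chardef begin
a 日
b 月
ab 明
ab 明天
%chardef end
"""


def parse_cin(text):
    """Split .cin text into (header, keynames, chardefs)."""
    header = {}     # directive name -> argument, e.g. "ename" -> "Demo"
    keynames = {}   # key -> character shown while composing
    chardefs = {}   # key sequence -> list of characters or phrases
    block = None    # None, "keyname" or "chardef"
    for raw in text.split("\n"):
        line = raw.strip()
        if not line:
            continue
        parts = line.split(None, 1)
        first = parts[0]
        rest = parts[1] if len(parts) > 1 else ""
        if block is None:
            if not first.startswith("%"):
                continue  # comments and stray lines outside any block
            name = first[1:]
            if name in ("keyname", "chardef") and rest == "begin":
                block = name
            else:
                header[name] = rest
        elif first == "%" + block and rest == "end":
            block = None
        elif rest:
            if block == "keyname":
                keynames[first] = rest
            else:
                chardefs.setdefault(first, []).append(rest)
    return header, keynames, chardefs


if __name__ == "__main__":
    header, keynames, chardefs = parse_cin(SAMPLE)
    print(header["ename"], keynames, chardefs["ab"])
```

Values in the chardef block are kept as a list per key sequence, since
one sequence may map to several candidates, including phrases.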

Different frameworks have implemented the details somewhat
differently. OV's implementation disallows Windows-style CR LF line
endings (so only the UNIX-style \n is used, which is also what OS X
uses), and comment lines (beginning with #) are not allowed in the
chardef block.
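
As a rough illustration, those restrictions (plus the UTF-8 requirement)
could be checked before a table is handed to OV. The function name
`find_ov_problems` and the wording of the messages are hypothetical, and
the check treats any chardef line starting with "#" as a comment, which
is an assumption rather than OV's documented behavior.

```python
def find_ov_problems(data: bytes):
    """Return a list of human-readable problems found in raw .cin bytes."""
    problems = []
    if b"\r" in data:
        problems.append("contains CR characters; use LF-only line endings")
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        problems.append("not valid UTF-8")
        return problems
    in_chardef = False
    for number, line in enumerate(text.split("\n"), start=1):
        words = line.split()
        if words == ["%chardef", "begin"]:
            in_chardef = True
        elif words == ["%chardef", "end"]:
            in_chardef = False
        elif in_chardef and words and words[0].startswith("#"):
            problems.append(f"line {number}: comment inside chardef block")
    return problems


if __name__ == "__main__":
    table = "%chardef begin\r\n# note\r\na 日\r\n%chardef end\r\n"
    for problem in find_ov_problems(table.encode("utf-8")):
        print(problem)
```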

Although .cin contains enough information for key-to-character/phrase
mapping, many input methods (like Cangjei/"Changjei" or Simplex/
Jianyi) require finer control. For OpenVanilla, the control is
provided in the form of input method preferences (with some
mind-boggling names like "force composition when reaching maximum
length of radical" or "use space to select the 1st candidate").
Different input methods require different controls (and those are a
must -- failure to provide them yields barely usable input methods).
gcin differs from OV's implementation in that it allows those control
directives to be expressed in the .cin header, with its own directive
extensions.

OpenVanilla's repository of .cin is available at:

  http://openvanilla.googlecode.com/svn/trunk/Modules/SharedData/

Zonble has written an excellent tutorial (in Chinese) on how to create
your own input method by writing a .cin, which has become something of
a standard text:

  http://docs.google.com/View?docid=ah6d8th954vw_201fd5dkx

Technically .cin is really just a set of key-value pairs with its own
conventions. OV makes heavy use of .cin as a format. Things like
reverse radical/pinyin lookup or associated phrases are also done
with .cin-based data tables. I see it as a good sign that Apple has
adopted a popular (and mostly consistent and cross-framework
compatible) data format for Leopard.
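
To make the key-value view concrete, here is a small sketch of a
reverse lookup built by inverting a table that is already in memory.
The function name `build_reverse_lookup` and the sample data are
invented for the example; OV ships its reverse lookup data as separate
.cin tables, so this is not a description of how OV does it.

```python
def build_reverse_lookup(chardefs, keynames):
    """Map each character or phrase to the radical strings that produce it.

    chardefs: key sequence -> list of characters/phrases
    keynames: single key -> character displayed for that key
    """
    reverse = {}
    for keys, values in chardefs.items():
        # Show the sequence with display names, falling back to raw keys.
        display = "".join(keynames.get(key, key) for key in keys)
        for value in values:
            reverse.setdefault(value, []).append(display)
    return reverse


if __name__ == "__main__":
    keynames = {"a": "日", "b": "月"}
    chardefs = {"a": ["日"], "b": ["月"], "ab": ["明"]}
    print(build_reverse_lookup(chardefs, keynames)["明"])
```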

So what about Leopard? As far as I know, dropping a UTF-8-encoded .cin
into ~/Library/Input Methods or /Library/Input Methods and then
logging in again just works. A new input method, using the name defined
in the .cin, shows up in the Input Menu tab of the International
preferences panel. I'm not aware of any per-method controls so far
(though I may simply be ignorant of them).
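
The drop-in step can be scripted. The sketch below only copies the file
after checking that it decodes as UTF-8; the helper name `install_cin`
and the file name in the usage line are hypothetical, and logging out
and back in still has to be done by hand.

```python
import shutil
from pathlib import Path


def install_cin(source: Path, input_methods_dir: Path) -> Path:
    """Copy a .cin table into an Input Methods folder and return the new path."""
    # Fail early (UnicodeDecodeError) if the table is not UTF-8.
    source.read_bytes().decode("utf-8")
    input_methods_dir.mkdir(parents=True, exist_ok=True)
    target = input_methods_dir / source.name
    shutil.copyfile(source, target)
    return target


if __name__ == "__main__":
    # Per-user location mentioned above; "mytable.cin" is a placeholder name.
    per_user_dir = Path.home() / "Library" / "Input Methods"
    source = Path("mytable.cin")
    if source.exists():
        print(install_cin(source, per_user_dir))
```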

As for limitations, I'm not aware of any either. OV's own
implementation (and many others) is only limited by memory and your
patience (loading a .cin with 200,000 entries on a G3 is no small
thing; a database-backed design would solve the problem). Leopard's
own take should not differ much. So it should be very flexible and
easily customizable.
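
One way a database-backed design might look, as a sketch using SQLite
through Python's standard sqlite3 module. The schema, the table and
column names, and both function names are assumptions made for the
example, not anything OV or Leopard is known to use.

```python
import sqlite3


def build_table_db(entries, path=":memory:"):
    """Store (key_sequence, value) pairs in an indexed SQLite table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS chardef ("
        "key_sequence TEXT NOT NULL, value TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE INDEX IF NOT EXISTS chardef_key ON chardef (key_sequence)"
    )
    conn.executemany(
        "INSERT INTO chardef (key_sequence, value) VALUES (?, ?)", entries
    )
    conn.commit()
    return conn


def lookup(conn, key_sequence):
    """Return the candidates for a key sequence, in insertion order."""
    rows = conn.execute(
        "SELECT value FROM chardef WHERE key_sequence = ? ORDER BY rowid",
        (key_sequence,),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = build_table_db([("a", "日"), ("ab", "明"), ("ab", "明天")])
    print(lookup(conn, "ab"))
```

With the table on disk instead of ":memory:", the one-time import cost
is paid once and later lookups go through the index rather than through
a full in-memory copy of the table.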

d.


Lukhnos D. Liu  
From: "Lukhnos D. Liu" <lukh...@gmail.com>
Date: Fri, 09 Nov 2007 10:11:13 -0800
Local: Fri, Nov 9 2007 1:11 pm
Subject: Re: Questions on plug-ins and the "cin" format in Leopard
On Nov 10, 1:25 am, "Lukhnos D. Liu" <lukh...@gmail.com> wrote:

> In terms of limitation, I'm not aware of that either. OV's own
> implementation (and many others) is only limited by memory and your
> patience (loading a .cin with 200,000 entries on a G3 is no small
> thing; a database-backed design will solve the problem). Leopard's own
> take should not differ much. So it should be very flexible and easily
> customizable.

I should quickly point out that both OV's and Leopard's .cin
implementations are fast. Loading 200k entries on a G3 is an extreme
case.

d.


傅可恩  
From: 傅可恩 <oxus...@gmail.com>
Date: Sat, 10 Nov 2007 02:08:11 -0000
Local: Fri, Nov 9 2007 9:08 pm
Subject: Re: Questions on plug-ins and the "cin" format in Leopard
Wow. Thanks for that Lukhnos!

I remember trying to implement Xcin on X11 when OS X first came out
and didn't have good Chinese support. I couldn't get anywhere, so I
really appreciate how much work has gone into making it easy to do
things with OV!

Cheers,

Kerim
