What is Sinhala Unicode

Donald Gaminitillake

Mar 26, 2006, 11:27:45 PM
to Anti-Sinha...@googlegroups.com
On April 26, 2004, the Sri Lanka Standards Institution (SLSI) published a
draft of SLSI 1134 for public comment.

Only two parties objected to it:
myself and the Sri Lanka Association of Printers.
My objections were rejected by a panel of more than 20 people.
None of the panel had any knowledge of typology or typography, or was
willing to listen to me.
But this meeting was documented and recorded in the files of SLSI.

SLSI 1134 was approved.
It is an incorrect and incomplete set of Sinhala characters.

Even before SLSI 1134, in 2003, I gave a public lecture at
the University of Sri Lanka,
and it can be downloaded from

http://www.cssl.lk/PL/ICT-&-lang3.ppt

SLSI 1134 went ahead with encoding the parts of characters (glyphs) that
are used to construct a character.
This became a problem: text created in one application does not read as
the same text in another application, and so on.

Legal issue:
During that year, the Intellectual Property Act No. 36 of
2003 came into effect.
My concept of individual characters to be used in computers became an
area protected by the above act because, as the act defines it, my
process, system and idea permit in practice the solution to a
specific problem in the field of technology.

************

Where are "KU", "REPAYA", "YANSAYA", "LU", "DU" and "Kayanna badhi shayanna",
which are needed to write the name of our President Rajapaksha?

+++++++++++

Recently I wrote this mail to the Executive Chairman of the ICTA.

----I quote from my mail --------

Dear Executive Chairman of the ICTA,

I hope you read my article in the Daily Mirror of Sat 4 March 2006.
If not, it is attached to this mail for your perusal.

Do you know that your SLSI 1134 is incorrect and incomplete?
Do you know that ICTA will never be able to implement the e-SL program
in the Sinhala and Tamil languages?
Do you know that the Constitution of Sri Lanka requires the Government
to use the Sinhala and Tamil languages in Sri Lanka?
Do you know that this is a mandatory requirement?
Do you know that google.lk is just hype?
Do you know that without Sinhala and Tamil the e-SL program is itself
hype?

If your SLSI 1134 is the correct solution, why can't the content
developers develop web sites in the Sinhala and Tamil languages?
Please note that a computer is not a typewriter. With your SLSI 1134 you
can use a computer as a typewriter but not as a computer.

Until when are you going to fool the general public and the Minister
for IT who is the Hon President of Sri Lanka?

Hope you have the time to visit my web site www.akuru.org.

""" Quote""" from a comment I received

Well asked - but don't expect these people to reply!

The Chairman of ICTA is also the former chairman of CINTEC who created
the current problem with Sinhala fonts. He is the least likely to do
anything to resolve it, because he will then be exposing his earlier
bungling.

Besides, his UCSC 'golaya' yyyyy and his buddy "xxxxxx" have
globe-trotted for a decade saying they are doing Sinhala fonts. All
that will be exposed and brought into question the moment things are
done PROPERLY even at this late stage.

"""UNQUOTE"""


All because SLSI 1134 is incorrect and incomplete.

The image of SLSI 1134 is annexed at the bottom.

These are facts.

Please read, understand and act to save the Sinhala language.

This problem also exists with the Tamil language.

Once I correct the Sinhala (image 1) I will do the Tamil (see annex
image 2).

Best

Donald


slsiencod22.jpg
tamil_lang.jpg

මම සිංහල

Mar 27, 2006, 12:31:03 AM
to Anti-Sinha...@googlegroups.com
On 3/27/06, Donald Gaminitillake <lankap...@gmail.com> wrote:
> On April 26, 2004, the Sri Lanka Standards Institution (SLSI) published a
> draft of SLSI 1134 for public comment.
>
> Only two parties objected to it:
> myself and the Sri Lanka Association of Printers.

None of the IT people had any comments. Oh, that's a shame: in Sri Lanka we don't have any intelligent IT personnel who can see these problems.

Only our printers could see this disaster.
 

> My objections were rejected by a panel of more than 20 people.

Oh, shame again. At least one of them could have seen this fact.

> None of the panel had any knowledge of typology or typography, or was
> willing to listen to me.

Oh, poor me.

> But this meeting was documented and recorded in the files of SLSI.
>
> SLSI 1134 was approved.
> It is an incorrect and incomplete set of Sinhala characters.
>
> Even before SLSI 1134, in 2003, I gave a public lecture at
> the University of Sri Lanka,
> and it can be downloaded from
>
> http://www.cssl.lk/PL/ICT-&-lang3.ppt
>
> SLSI 1134 went ahead with encoding the parts of characters (glyphs) that
> are used to construct a character.
> This became a problem: text created in one application does not read as
> the same text in another application, and so on.

If those applications are built according to Unicode, this should work. It is their rendering problem.
I also see this in Firefox: sometimes it won't render correctly. That doesn't mean there is a problem with the Unicode character set.

> Legal issue:
> During that year, the Intellectual Property Act No. 36 of
> 2003 came into effect.
> My concept of individual characters to be used in computers became an
> area protected by the above act because, as the act defines it, my
> process, system and idea permit in practice the solution to a
> specific problem in the field of technology.
>
> ************
>
> Where are "KU", "REPAYA", "YANSAYA", "LU", "DU" and "Kayanna badhi shayanna",
> which are needed to write the name of our President Rajapaksha?


All of these can be represented by Unicode, but not as your 4 digits.
Why do we need 4 digits?
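
A minimal Python sketch of that claim, assuming the sequences below are the intended encodings (they follow the Unicode Sinhala chart as I read it, where al-lakuna followed by ZWJ requests a conjunct form). The `SEQUENCES` table, its labels and the `describe` helper are illustrative names, not anything defined by the standard:

```python
import unicodedata

# Assumed code point sequences for some of the requested shapes.
# The labels are informal transliterations, not official names.
SEQUENCES = {
    "ku": "\u0D9A\u0DD4",                # ka + paa-pilla
    "du": "\u0DAF\u0DD4",                # da + paa-pilla
    "ksha": "\u0D9A\u0DCA\u200D\u0DC2",  # ka + al-lakuna + ZWJ + ssa
}

def describe(text):
    """Return (hex code point, Unicode name) pairs for each character in text."""
    return [(f"{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")) for ch in text]

for label, seq in SEQUENCES.items():
    print(label, seq, describe(seq))
```

Each shape is a short sequence of existing code points; whether it displays as one joined unit depends on the font and the renderer, not on the stored text.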

 

donald gaminitillake

Mar 27, 2006, 1:51:54 AM
to Anti-Sinhala-Unicode
> Where are "KU", "REPAYA", "YANSAYA", "LU", "DU" and "Kayanna badhi shayanna",
> which are needed to write the name of our President Rajapaksha?
>
> All of these can be represented by Unicode, but not as your 4 digits.

You admit that the above characters are "not defined" in Unicode.

> Why do we need 4 digits?

If you are worried about 4 digits, give them in the following format:
Table, Group, Plane, Row, DEC, HEX and Name

SLSI 1134 is incorrect.

This is how Unicode defines locations.

donald gaminitillake

Mar 27, 2006, 2:10:17 AM
to Anti-Sinhala-Unicode
http://www.unicode.org/charts/

Download the Sinhala chart.

Find the answer by yourself as to why we need the 4 digits.

Anuradha Ratnaweera

Mar 27, 2006, 3:03:53 AM
to Anti-Sinha...@googlegroups.com
donald gaminitillake wrote:
>> Where are "KU", "REPAYA", "YANSAYA", "LU", "DU" and "Kayanna badhi shayanna",
>> which are needed to write the name of our President Rajapaksha?
>>
>> All of these can be represented by Unicode, but not as your 4 digits.
>
> You admit that the above characters are "not defined" in Unicode.
>
>> Why do we need 4 digits?
>
> If you are worried about 4 digits, give them in the following format:
> Table, Group, Plane, Row, DEC, HEX and Name
>
Let's get the terminology correct. I'm sorry if I used the words
"digit" and "code point" interchangeably, which has led to a bit of
confusion. I apologize.

I think what Mr Donald means by 4 digits is something like this: 0DDC,
with 0, D, D and C being the four digits.

What I was referring to were "code points". 0DDC is a single code
point, so is 200D.

So, you were asking for a single code point for "du" (which is four
digits). My question is why a single code point (four digits) like
0DDC? My answer contains two code points (i.e., four digits followed by
another four digits). What's wrong with that?
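
To make the distinction concrete, here is a small Python sketch. It assumes "du" is stored as da (U+0DAF) followed by paa-pilla (U+0DD4), as listed in the Unicode Sinhala chart; the variable names are only for illustration:

```python
# "du" as stored in a Unicode string: two code points,
# each conventionally written with four hexadecimal digits.
du = "\u0DAF\u0DD4"  # da (U+0DAF) followed by paa-pilla (U+0DD4)

code_points = [f"U+{ord(ch):04X}" for ch in du]

print(code_points)  # ['U+0DAF', 'U+0DD4']
print(len(du))      # 2
```

So "four digits" describes how one code point is written down, while "du" is a sequence of two such code points.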

Anuradha

Donald Gaminitilake

Mar 27, 2006, 3:22:24 AM
to Anti-Sinha...@googlegroups.com
What is the single code point for "DU"?
1+2="3"
Where is the code point "3" for DU?

This code point (location) is not given in Unicode.

This is the problem I am addressing.

Donald

Anuradha Ratnaweera

Mar 27, 2006, 3:29:46 AM
to Anti-Sinha...@googlegroups.com
On 3/27/06, Donald Gaminitilake <lankap...@gmail.com> wrote:
>
> What is the single code point for "DU"?
> 1+2="3"
> Where is the code point "3" for DU?
>
> This code point (location) is not given in Unicode.

Great! We have solved the "code point" vs "digit" problem.... :-)

There are two different things here: characters and graphemes.
Characters have code points, and graphemes don't. What we see as
single visual units (like "du" and "ksha") are graphemes.

One or more code points "produce" one or more graphemes.

In the example of "du", the characters "da" + "paapilla" make the single
grapheme "du".

In Latin script, one character typically produces a single grapheme
(character A producing grapheme A).

"du" is a grapheme, so it doesn't have, and doesn't need to have, a
single code point.

And as your nice analogy shows, "1 + 2 = 3". If 3 has a separate code
point, then 1 and 2 are not needed at all.
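
A rough Python sketch of the difference between counting code points and counting visual units. The `rough_clusters` helper is hypothetical and deliberately crude: it just attaches every combining mark to the preceding character, which is not the full Unicode text segmentation algorithm and ignores joiners such as U+200D:

```python
import unicodedata

def rough_clusters(text):
    """Group each combining mark (general category M*) with the character before it.

    Illustration only: a crude approximation of grapheme clusters.
    """
    clusters = []
    for ch in text:
        if clusters and unicodedata.category(ch).startswith("M"):
            clusters[-1] += ch   # mark joins the previous base character
        else:
            clusters.append(ch)  # start a new cluster
    return clusters

du = "\u0DAF\u0DD4"  # da + paa-pilla (assumed encoding of "du")

print(len(du), len(rough_clusters(du)))    # 2 1
print(len("A"), len(rough_clusters("A")))  # 1 1
```

Two code points, one visual unit for "du"; one code point, one visual unit for Latin "A".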

Anuradha
--
http://www.linux.lk/~anuradha/
http://anuradha-ratnaweera.blogspot.com

Donald Gaminitilake

Mar 27, 2006, 7:31:05 AM
to Anti-Sinha...@googlegroups.com
For you "DU" is not a character but a grapheme.

But to render (visually display) the "DU", the pointer should have an absolute value and a location.

A code point or "location": if these are not defined in Unicode, it is incorrect.

Donald


Anuradha Ratnaweera

Mar 27, 2006, 7:58:36 AM
to Anti-Sinha...@googlegroups.com
On 3/27/06, Donald Gaminitilake <lankap...@gmail.com> wrote:
>
> For you "DU" is not a character but a grapheme.
>
> But to render (visually display) the "DU", the pointer should have an
> absolute value and a location.

Mr Donald, now I sincerely trust that this time you are seriously
trying to understand Sinhala Unicode, so let me try to explain the
remaining piece of the puzzle.

Let me confess something. For a few years, I also was wondering if
Sinhala Unicode is wrong exactly for this reason!

Rendering is not within the scope of the standard; the standard is only
about representation. It says only this about "du":

"du" is represented by "da" followed by "paapilla" (I gave the
exact numbers before).

Visual display, printing, keyboard layout etc. are NOT defined by
Unicode. It simply says that if you find "da" followed by "paapilla",
display a "du"!

This is the most interesting part of the puzzle. The displaying of
"du" is done by a shaper. Today, Indic languages are using OpenType
fonts to make the life of the shaper simple.

OpenType fonts (I am using LK-LUG font as an example) have shapes
(graphemes) with and without code points.

- Some graphemes have code points (e.g.: the shape of "da" has code point 0DAF)
- Some graphemes don't have code points (e.g.: "du")

So the font has a shape for "du", but it doesn't have a code point!
So how on earth does the shaper know when to display it?

That's done by a table called GSUB.

There is a GSUB table entry saying that the shape for "du" is used for
the sequence 0DAF, 0DD4.

So, whenever the shaper sees 0DAF (da) followed by 0DD4 (paapilla) in a
text, it checks the GSUB table, and displays that shape for "du"
(which doesn't have a code point).
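
The lookup can be sketched as a toy model in Python. This is not how a real OpenType font or shaper is implemented: the `CMAP` and `GSUB_LIGATURES` tables, the glyph names and the `shape` function are all invented for illustration, and paapilla is assumed to be U+0DD4 as in the Unicode Sinhala chart:

```python
# Toy stand-in for a font: a cmap-like table for single code points and a
# GSUB-like table for sequences. Glyph names are invented for illustration.
CMAP = {0x0DAF: "da", 0x0DD4: "paapilla", 0x0D9A: "ka"}
GSUB_LIGATURES = {(0x0DAF, 0x0DD4): "du"}  # a glyph with no code point of its own

def shape(text):
    """Map text to glyph names, preferring a two-code-point substitution when one exists."""
    cps = [ord(ch) for ch in text]
    glyphs = []
    i = 0
    while i < len(cps):
        pair = tuple(cps[i:i + 2])
        if pair in GSUB_LIGATURES:
            glyphs.append(GSUB_LIGATURES[pair])  # sequence -> single glyph
            i += 2
        else:
            glyphs.append(CMAP.get(cps[i], ".notdef"))  # plain one-to-one mapping
            i += 1
    return glyphs

print(shape("\u0DAF\u0DD4"))  # ['du']
print(shape("\u0DAF"))        # ['da']
```

The stored text never changes; only the choice of glyph does.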

Then the shaper also does matra reordering. "do", for example, is
written with two Unicode characters, 0DAF and 0DDC, which should be
displayed as three graphemes (kombuwa, da and alapilla).
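
A character-level Python sketch of that reordering, under stated assumptions: "do" is taken to be U+0DAF followed by U+0DDC, and the two-part vowel sign is split using Unicode canonical decomposition (NFD). Real shapers do this on glyphs through font features; the `PRE_BASE` set and `visual_order` function here are hypothetical and only handle a sign that directly follows a single consonant:

```python
import unicodedata

PRE_BASE = {"\u0DD9"}  # kombuva is drawn to the left of its consonant

def visual_order(text):
    """Split two-part vowel signs with NFD, then move a pre-base sign
    in front of the character that precedes it in storage order."""
    out = []
    for ch in unicodedata.normalize("NFD", text):
        if ch in PRE_BASE and out:
            out.insert(len(out) - 1, ch)  # place before the previous character
        else:
            out.append(ch)
    return out

do = "\u0DAF\u0DDC"  # da + vowel sign "kombuva haa aela-pilla"

print([f"{ord(c):04X}" for c in visual_order(do)])  # ['0DD9', '0DAF', '0DCF']
```

Storage order stays consonant-first; only the display order puts the kombuwa on the left.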

This is not just theory. It is already implemented and works on
GNU/Linux and Windows (uniscribe).

Please feel free to ask if you need any more clarifications.

Anuradha Ratnaweera

Mar 27, 2006, 8:03:07 AM
to Anti-Sinha...@googlegroups.com
On 3/27/06, Anuradha Ratnaweera <gnu.sla...@gmail.com> wrote:
>
> This is not just theory. It is already implemented and works on
> GNU/Linux and Windows (uniscribe).

I am sorry, I forgot Mac (my brother has a Mac, but he runs GNU/Linux
on it, not OS X). Looks like Sinhala Unicode works on Mac too.
Unfortunately, it's done by a different company, not by Apple, so it
has to be purchased separately.

http://www.xenotypetech.com/osxSinhala.html
http://www.xenotypetech.com/

However, it would be great if someone could convince Apple to implement
it as a standard feature of OS X.

I hope you will venture to check out the above product and read (and
hopefully write) in Sinhala on this group and on Sinhala-Unicode
group.

Anuradha Ratnaweera

Mar 27, 2006, 8:08:18 AM
to Anti-Sinha...@googlegroups.com
On 3/27/06, Anuradha Ratnaweera <gnu.sla...@gmail.com> wrote:
>
> OpenType fonts (I am using LK-LUG font as an example) have shapes
> (graphemes) with and without code points.

Although I used the word "grapheme" here, the most correct word is
"glyph". I just didn't want to introduce another term to complicate
the discussion.

Donald Gaminitilake

Mar 27, 2006, 1:10:04 PM
to Anti-Sinha...@googlegroups.com
Du is not a "glyph".
It is an individual character.
All individual characters need code points.

The simple answer was that "GSUB" is not registered in Unicode.
Someone decided what is a character and what is a "glyph".

By the same theory, why do you have four ayannas registered in Unicode?

You could have done with one ayanna;
the other three could have been "glyphs".


SLSI 1134 is incorrect and has to be corrected

I have won

Game over

Donald