Re: Emoji and cell phone character sets...

Asmus Freytag

Jan 9, 2009, 1:57:07 PM

to Markus Scherer, vun...@vfemail.net, cf...@gmx.net, emoji4...@googlegroups.com

On 1/9/2009 10:31 AM, Markus Scherer wrote:
> On Fri, Jan 9, 2009 at 2:35 AM, <vun...@vfemail.net
> <mailto:vun...@vfemail.net>> wrote:
>
> As the proposal stands a number of the emoji are in fact
> duplicates of existing unicode characters - the principle of non
> duplication has not always been applied.
>
>
> It is not clear to me what exactly you are proposing to change.
>
> If you disagree with a specific unification, or you propose a specific
> new unification, then please send an email to
> emoji4...@googlegroups.com <mailto:emoji4...@googlegroups.com>
>
>
I'm sure he meant that only as an example in the discussion, but here it
is, formally forwarded, you can extract the text between ----

--------------------
Please provide a rationale for the apparently inconsistent treatment of
e-B45 vs e-B53

"e-B45
U+1F4FD CROSS MARK
Temporary Notes: bad; NO GOOD, not approved; X in tic tac toe.
Tentatively disunified from U+2715"

vs

"e-B53
U+2716 HEAVY MULTIPLICATION X
Temporary Notes: Unified with U+2716"

Why is the latter unified and the former is not?

-----------------

Markus Scherer

Jan 9, 2009, 2:12:52 PM

to Asmus Freytag, vun...@vfemail.net, cf...@gmx.net, emoji4...@googlegroups.com

On Fri, Jan 9, 2009 at 10:57 AM, Asmus Freytag <asm...@ix.netcom.com> wrote:

> I'm sure he meant that only as an example in the discussion, but here it is, formally forwarded, you can extract the text between ----
>
> --------------------
> Please provide a rationale for the apparently inconsistent treatment of e-B45 vs e-B53

What I was really hoping for was "Please unify e-B45 with U+2715" or "Please disunify e-B53 from U+2716".

But to answer your question --

> "e-B45
> U+1F4FD CROSS MARK
> Temporary Notes: bad; NO GOOD, not approved; X in tic tac toe. Tentatively disunified from U+2715"
>
> vs
>
> "e-B53
> U+2716 HEAVY MULTIPLICATION X
> Temporary Notes: Unified with U+2716"
>
> Why is the latter unified and the former is not?

KDDI has a set of basic math operators with "heavy" looking images. It's natural to unify one with the existing HEAVY MULTIPLICATION X (and add the others in the same block).

For e-B45 vs. U+2715, I don't quite remember. I do remember that we looked at all the x's and cross marks and decided to leave this one disunified, but I don't remember specifically. (For what it's worth, the best record of this one seems to be in project issue 32.)

Looking at it anew, it seems like e-B45 wants to have a heavier glyph than U+2715, but there appears to be no technical reason (such as source separation) that we couldn't unify.

What do you think?

markus

Asmus Freytag

Jan 9, 2009, 2:49:22 PM

to Markus Scherer, vun...@vfemail.net, cf...@gmx.net, emoji4...@googlegroups.com

On 1/9/2009 11:12 AM, Markus Scherer wrote:

> On Fri, Jan 9, 2009 at 10:57 AM, Asmus Freytag <asm...@ix.netcom.com
> <mailto:asm...@ix.netcom.com>> wrote:
>
> I'm sure he meant that only as an example in the discussion, but
> here it is, formally forwarded, you can extract the text between ----
>
> --------------------
> Please provide a rationale for the apparently inconsistent
> treatment of e-B45 vs e-B53
>
>
> What I was really hoping for was "Please unify e-B45 with U+2715" or
> "Please disunify e-B53 from U+2716".
>
> But to answer your question --
>
> "e-B45
> U+1F4FD CROSS MARK
> Temporary Notes: bad; NO GOOD, not approved; X in tic tac toe.
> Tentatively disunified from U+2715"
>
> vs
>
> "e-B53
> U+2716 HEAVY MULTIPLICATION X
> Temporary Notes: Unified with U+2716"
>
> Why is the latter unified and the former is not?
>
>
> KDDI has a set of basic math operators with "heavy" looking images.
> It's natural to unify one with the existing HEAVY MULTIPLICATION X
> (and add the others in the same block).

See, this is where this gets difficult. If these are really
*mathematical* symbols, please look at 2A2F as well and give a rationale
why this isn't the proper unification target. (The length of the arms
seems to be different, with 2A2F shorter and intended to match the
Latin-1 multiplication operator.)

>
> For e-B45 vs. U+2715, I don't quite remember. I do remember that we
> looked at all the x's and cross marks and decided to leave this one
> disunified, but I don't remember specifically. (For what it's worth,
> the best record of this one seems to be in project issue 32
> <http://code.google.com/p/emoji4unicode/issues/detail?id=32>.)
That contains no further info than what's in the comment.

>
> Looking at it anew, it seems like e-B45 wants to have a heavier glyph
> than U+2715, but there appears to be no technical reason (such as
> source separation) that we couldn't unify.

If you end up using the KDDI symbol with VECTOR PRODUCT, the 2716 would
be free for this, but I suspect what might really be meant for "NO GOOD"
is 2718, which differs in that it's intended to look hand-written. To
really properly decide unification you need to be able to assert that
the 'sans-serif' nature of this thing is an essential characteristic (or
not).

Hope this helps investigate this further.

A./

Markus Scherer

Jan 9, 2009, 3:48:41 PM

to Asmus Freytag, vun...@vfemail.net, cf...@gmx.net, emoji4...@googlegroups.com

On Fri, Jan 9, 2009 at 11:49 AM, Asmus Freytag <asm...@ix.netcom.com> wrote:

>> KDDI has a set of basic math operators with "heavy" looking images. It's natural to unify one with the existing HEAVY MULTIPLICATION X (and add the others in the same block).
>
> See, this is where this gets difficult. If these are really *mathematical* symbols, please look at 2A2F as well and give a rationale why this isn't the proper unification target. (The length of the arms seems to be different, with 2A2F shorter and intended to match the Latin-1 multiplication operator.)

I don't think we found U+2A2F VECTOR OR CROSS PRODUCT when we looked at these. Why does Unicode have that in addition to U+00D7 MULTIPLICATION SIGN?

The KDDI operators fill their square bounding boxes and look "heavy" which for me is a better match for U+2716 HEAVY MULTIPLICATION X.

>> For e-B45 vs. U+2715, I don't quite remember. I do remember that we looked at all the x's and cross marks and decided to leave this one disunified, but I don't remember specifically. (For what it's worth, the best record of this one seems to be in project issue 32 <http://code.google.com/p/emoji4unicode/issues/detail?id=32>.)
>
> That contains no further info than what's in the comment.
>
>> Looking at it anew, it seems like e-B45 wants to have a heavier glyph than U+2715, but there appears to be no technical reason (such as source separation) that we couldn't unify.
>
> If you end up using the KDDI symbol with VECTOR PRODUCT, the 2716 would be free for this, but I suspect what might really be meant for "NO GOOD" is 2718, which differs in that it's intended to look hand-written. To really properly decide unification you need to be able to assert that the 'sans-serif' nature of this thing is an essential characteristic (or not).

We rejected this once before based on KDDI's and SoftBank's shapes for e-B45, but I will ask Kat again if he thinks the handwriting-looking U+2718 HEAVY BALLOT X would be ok.

> Hope this helps investigate this further.

It does, by providing more data points, insights and suggestions. I filed project issue 76 for these.

Thanks,
markus
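
For reference, the names and general categories of the existing code points mentioned in this thread can be checked against the Unicode Character Database. The following is only a minimal sketch using Python's standard unicodedata module; the CANDIDATES list and the describe helper are illustrative names, not part of the proposal or of any project tooling.

```python
import unicodedata

# Existing code points named in this thread as possible unification targets.
# (The list itself is an illustrative assumption, not project data.)
CANDIDATES = [0x00D7, 0x2715, 0x2716, 0x2718, 0x2A2F]


def describe(cp: int) -> str:
    """Return 'U+XXXX NAME (category Xx)' for one code point."""
    ch = chr(cp)
    name = unicodedata.name(ch)
    category = unicodedata.category(ch)
    return f"U+{cp:04X} {name} (category {category})"


for cp in CANDIDATES:
    print(describe(cp))
```

The general category separates the math operators (Sm) from the dingbats (So), which is the distinction at issue when deciding whether the KDDI symbols are mathematical or something else.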

vun...@vfemail.net

Jan 9, 2009, 3:25:37 PM

to Asmus Freytag, Markus Scherer, cf...@gmx.net, emoji4...@googlegroups.com

Quoting "Asmus Freytag" <asm...@ix.netcom.com>:

> On 1/9/2009 10:31 AM, Markus Scherer wrote:
>> On Fri, Jan 9, 2009 at 2:35 AM, <vun...@vfemail.net
>> <mailto:vun...@vfemail.net>> wrote:
>>
>> As the proposal stands a number of the emoji are in fact
>> duplicates of existing unicode characters - the principle of non
>> duplication has not always been applied.
>>
>>
>> It is not clear to me what exactly you are proposing to change.
>>

IMHO the best approach is "if in doubt, unify"; here the case is recorded
as "tentatively disunified". For a number of emoji no unification
issues exist: take, for example, a hamburger; there are clearly no
encoded hamburgers. Being different from an encoded Dingbat is IMHO
essential.

If one takes the approach "if in doubt, disunify", one's conclusions will be
very different. IMHO this would be a very damaging approach to take.

One question then is what unification principles should be applied?

Another question this raises is "Is the proposal mature and well thought out?"

John Knightley

Asmus Freytag

Jan 9, 2009, 4:39:03 PM

to Markus Scherer, vun...@vfemail.net, cf...@gmx.net, emoji4...@googlegroups.com

On 1/9/2009 12:48 PM, Markus Scherer wrote:
> On Fri, Jan 9, 2009 at 11:49 AM, Asmus Freytag <asm...@ix.netcom.com
> <mailto:asm...@ix.netcom.com>> wrote:
>
> KDDI has a set of basic math operators with "heavy" looking
> images. It's natural to unify one with the existing HEAVY
> MULTIPLICATION X (and add the others in the same block).
>
> See, this is where this gets difficult. If these are really
> *mathematical* symbols, please look at 2A2F as well and give a
> rationale why this isn't the proper unification target. (The
> length of the arms seems to be different, with 2A2F shorter and
> intended to match the Latin-1 multiplication operator.)
>
>
> I don't think we found U+2A2F VECTOR OR CROSS PRODUCT when we looked
> at these. Why does Unicode have that in addition to U+00D7
> MULTIPLICATION SIGN?

My best (quick) guess is because a) the 2700 block characters are
dingbats, and/or b) the glyph design is only approximately correct for
math use, whereas the glyph design for 2700 block dingbats should match
the Zapf design.

>
> The KDDI operators fill their square bounding boxes and look "heavy"
> which for me is a better match for U+2716 HEAVY MULTIPLICATION X.

You've got to decide whether these are mathematical, or whether these
are something else. I suspect they are something else, because I don't
see people writing "cross product" on their phone, but you might check
with the physics department at the University of Kyoto (I can give you a
contact :-) ).

>
>
>
> For e-B45 vs. U+2715, I don't quite remember. I do remember
> that we looked at all the x's and cross marks and decided to
> leave this one disunified, but I don't remember specifically.
> (For what it's worth, the best record of this one seems to be
> in project issue 32
> <http://code.google.com/p/emoji4unicode/issues/detail?id=32>.)
>
> That contains no further info than what's in the comment.
>
>
> Looking at it anew, it seems like e-B45 wants to have a
> heavier glyph than U+2715, but there appears to be no
> technical reason (such as source separation) that we couldn't
> unify.
>
> If you end up using the KDDI symbol with VECTOR PRODUCT, the 2716
> would be free for this, but I suspect what might really be meant
> for "NO GOOD" is 2718, which differs in that it's intended to look
> hand-written. To really properly decide unification you need to be
> able to assert that the 'sans-serif' nature of this thing is an
> essential characteristic (or not).
>
>
> We rejected this once before based on KDDI's and SoftBank's shapes for
> e-B45, but I will ask Kat again if he thinks the handwriting-looking
> U+2718 HEAVY BALLOT X would be ok.
>
> Hope this helps investigate this further.
>
>
> It does, by providing more data points, insights and suggestions. I
> filed project issue 76
> <http://code.google.com/p/emoji4unicode/issues/detail?id=76> for these.
>
> Thanks,
> markus
>

vun...@vfemail.net

Jan 9, 2009, 6:05:18 PM

to Markus Scherer, Asmus Freytag, cf...@gmx.net, emoji4...@googlegroups.com

Quoting "Markus Scherer" <marku...@gmail.com>:

> issue 76 <http://code.google.com/p/emoji4unicode/issues/detail?id=76> for
> these.
>

Thank-you Markus for filing a report on this one.

John Knightley

>
> Thanks,
> markus
>

Michael Everson

Jan 9, 2009, 6:36:10 PM

to emoji4...@googlegroups.com, sym...@unicode.org

On 9 Jan 2009, at 18:31, Markus Scherer wrote:

> If you disagree with a specific unification, or you propose a
> specific new unification, then please send an email to emoji4...@googlegroups.com

Why is that different from sym...@unicode.org?

I have a lot of issues with the character set, but the development
process the proposal makes use of is most difficult.

I ask YOU, Markus, to respond specifically to my request to have the
Katakana transliterated so that the native designations can be
evaluated.

Your team-member Mark Davis suggested that I use Google's machine
translation to do this. I am not interested in doing so. I want your
project team to provide the transliteration in response to my request.

Will you do this?

Michael Everson * http://www.evertype.com

Markus Scherer

Jan 9, 2009, 7:00:35 PM

to emoji4...@googlegroups.com, sym...@unicode.org

On Fri, Jan 9, 2009 at 3:36 PM, Michael Everson <eve...@evertype.com> wrote:

> On 9 Jan 2009, at 18:31, Markus Scherer wrote:
>> If you disagree with a specific unification, or you propose a
>> specific new unification, then please send an email to emoji4...@googlegroups.com
>
> Why is that different from sym...@unicode.org?

We have a public list so that not just UTC members can submit feedback.

> I have a lot of issues with the character set, but the development
> process the proposal makes use of is most difficult.
>
> I ask YOU, Markus, to respond specifically to my request to have the
> Katakana transliterated so that the native designations can be
> evaluated.

I can work on transliteration of Katakana and Hiragana, but I personally cannot provide translations from Japanese, and I cannot provide the back-stories for many of the symbols that someone who is from, or has lived in, Japan can provide. For that, I myself have to ask colleagues and others involved.

If someone wants to provide translations and explanations, I can try to work those into the data and into the chart.

Best regards,
markus

Asmus Freytag

Jan 9, 2009, 7:09:04 PM

to Markus Scherer, emoji4...@googlegroups.com, sym...@unicode.org

On 1/9/2009 4:00 PM, Markus Scherer wrote:
>
> I ask YOU, Markus, to respond specifically to my request to have the
> Katakana transliterated so that the native designations can be
> evaluated.
>
>
> I can work on transliteration of Katakana and Hiragana, but I
> personally cannot provide translations from Japanese, and I cannot
> provide the back-stories for many of the symbols that someone who is
> from, or has lived in, Japan can provide. For that, I myself have to
> ask colleagues and others involved.
>
> If someone wants to provide translations and explanations, I can try
> to work those into the data and into the chart.

Say, that's a decent offer, Markus. Compared to other sets of
miscellaneous characters that have come from Japan over time, there's
already quite a bit more information on background and usage than
average. Keep up the good work.

A./

Michael Everson

Jan 9, 2009, 8:37:33 PM

to Markus Scherer, emoji4...@googlegroups.com, sym...@unicode.org

On 10 Jan 2009, at 00:00, Markus Scherer wrote:

> We have a public list so that not just UTC members can submit
> feedback.

So, I can send mail to it. Am I subscribed to it?

> I have a lot of issues with the character set, but the development
> process the proposal makes use of is most difficult.

The way you (pl.) are managing the development process seems to me to
be problematic.

For instance, I've asked N times, "who's making the font?"
Now I hear it is "Apple". But who? Peter Lofting?

> I ask YOU, Markus, to respond specifically to my request to have the
> Katakana transliterated so that the native designations can be
> evaluated.
>
> I can work on transliteration of Katakana and Hiragana, but I
> personally cannot provide translations from Japanese,

Transliterations are likely to be sufficient.

> and I cannot provide the back-stories for many of the symbols that
> someone who is from, or has lived in, Japan can provide. For that, I
> myself have to ask colleagues and others involved.

The process seems very exclusive.

Christopher Fynn

Jan 11, 2009, 4:31:56 AM

to emoji4unicode, unicode, pete...@microsoft.com

On 11/01/2009, Peter Constable <pete...@microsoft.com> wrote:

>> BTW This is not hypothetical, there are already cell phones available in
>> Tibet which use a pre-composed Tibetan character set:
>> <http://www.actapress.com/PaperInfo.aspx?PaperID=30325>.

> And all of those pre-composed elements can be represented using existing
> Unicode characters with reliable round-trip-ability. So, we won't be needing
> to encode those as separate characters in Unicode (just in case anybody was
> wondering).

Although it is fairly straightforward, I don't know if anyone like
Google, MS or the Chinese telecom companies has actually implemented
a conversion between pre-composed Tibetan and Unicode. Until they do,
round-tripping is hypothetical. The pre-composed Tibetan standard
GB/T20524-2006 also uses a PUA character encoding, so the objection
that has been raised wrt using a PUA encoding for emoji, i.e. how
to determine which PUA convention is being used, applies here too.

- Chris

> Peter
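
To make the round-tripping point concrete, here is a rough sketch of a table-driven conversion between pre-composed PUA code points and Unicode sequences. Everything specific in it is an assumption for illustration: the PUA values U+F300 and U+F301 are invented and are not taken from GB/T20524-2006, and the function names are hypothetical. It only shows that the round trip works when both sides agree on the same table, which is exactly the "which PUA convention" question.

```python
# Hypothetical table: the PUA code points are invented for illustration
# and do NOT come from GB/T20524-2006 or any vendor's encoding.
PRECOMPOSED_TO_UNICODE = {
    "\uF300": "\u0F66\u0F90",  # TIBETAN LETTER SA + SUBJOINED LETTER KA
    "\uF301": "\u0F62\u0F92",  # TIBETAN LETTER RA + SUBJOINED LETTER GA
}
UNICODE_TO_PRECOMPOSED = {v: k for k, v in PRECOMPOSED_TO_UNICODE.items()}
MAX_SEQ = max(len(seq) for seq in UNICODE_TO_PRECOMPOSED)


def to_unicode(text: str) -> str:
    """Replace each known PUA character with its Unicode sequence."""
    return "".join(PRECOMPOSED_TO_UNICODE.get(ch, ch) for ch in text)


def to_precomposed(text: str) -> str:
    """Replace known Unicode sequences with PUA characters, longest match first."""
    out = []
    i = 0
    while i < len(text):
        for n in range(MAX_SEQ, 0, -1):
            chunk = text[i:i + n]
            if chunk in UNICODE_TO_PRECOMPOSED:
                out.append(UNICODE_TO_PRECOMPOSED[chunk])
                i += n
                break
        else:
            # No table entry starts here; copy the character unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


# Two pre-composed stacks separated by U+0F0B TIBETAN MARK INTERSYLLABIC TSHEG.
sample = "\uF300\u0F0B\uF301"
round_tripped = to_precomposed(to_unicode(sample))
```

A receiver that assumed a different PUA table would decode the same bytes to different text, so the table (or a label naming it) has to travel with the data.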

Markus Scherer

Jan 12, 2009, 7:17:31 PM

to emoji4...@googlegroups.com, sym...@unicode.org

On Fri, Jan 9, 2009 at 4:00 PM, Markus Scherer <marku...@gmail.com> wrote:

>> On Fri, Jan 9, 2009 at 3:36 PM, Michael Everson <eve...@evertype.com> wrote:
>>
>> I ask YOU, Markus, to respond specifically to my request to have the
>> Katakana transliterated so that the native designations can be
>> evaluated.
>
> I can work on transliteration of Katakana and Hiragana, [...]

Done: Search for "「kuroobaa」" (and try to pronounce like "clover") in http://www.unicode.org/~scherer/emoji4unicode/snapshot/utc.html

Project issue 78 moved to Fixed: http://code.google.com/p/emoji4unicode/issues/detail?id=78

markus
