UTF-8 variant selector + Mac OS X: characters cut off

Dennis Preiser

Jul 12, 2015, 4:48:20 PM
to fltkg...@googlegroups.com
Hi,

while testing an FLTK-based newsreader (flnews by Michael Bäuerle) in
de.alt.test we encountered a strange issue. In combination with special
UTF-8 selectors, some characters were not drawn properly on Mac OS X.

This issue can be reproduced with the shipped example editor in test
(test/editor). Here is a file which triggers this issue:

<http://d--p.de/tmp/fltk_anomaly.txt>

The first line begins with a warning sign (U+26A0) followed by a variant
selector (U+FE0F). If you load the file in test/editor, the last character
("Z") is cut off:

<http://d--p.de/tmp/2015-07-12_fltk-editor.png>
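
To check which code points such a file actually contains, a minimal decoder is enough. The sketch below is not FLTK code: `decode_utf8` is a hypothetical helper that assumes well-formed input and performs no validation.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: decode well-formed UTF-8 into code points.
// Malformed or overlong sequences are not detected.
std::vector<std::uint32_t> decode_utf8(const std::string& s) {
    std::vector<std::uint32_t> out;
    std::size_t i = 0;
    while (i < s.size()) {
        const unsigned char b = static_cast<unsigned char>(s[i]);
        std::uint32_t cp;
        std::size_t extra;  // number of continuation bytes that follow
        if (b < 0x80)      { cp = b;        extra = 0; }
        else if (b < 0xE0) { cp = b & 0x1F; extra = 1; }
        else if (b < 0xF0) { cp = b & 0x0F; extra = 2; }
        else               { cp = b & 0x07; extra = 3; }
        ++i;
        for (std::size_t k = 0; k < extra && i < s.size(); ++k, ++i) {
            cp = (cp << 6) | (static_cast<unsigned char>(s[i]) & 0x3F);
        }
        out.push_back(cp);
    }
    return out;
}
```

Fed the bytes E2 9A A0 EF B8 8F, this yields the two code points U+26A0 and U+FE0F.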

I've applied this patch:

<http://d--p.de/tmp/unittest_text.diff>

--- test/unittest_text_orig.cxx 2011-07-19 06:49:30.000000000 +0200
+++ test/unittest_text.cxx 2015-07-12 12:21:03.000000000 +0200
@@ -60,7 +60,7 @@ public:
     fl_font(FL_HELVETICA, 30);
     int xx = x0+55;
     int yy = y0+40;
-    DrawTextAndBoxes("!abcdeABCDE\"#A", xx, yy); yy += 50; // mixed string
+    DrawTextAndBoxes("\xE2\x9A\xA0\xEF\xB8\x8F ABCDEFGHIJKLMNOPQRSTUVWXYZ", xx, yy); yy += 50; // mixed string
     DrawTextAndBoxes("oacs", xx, yy); xx += 100; // small glyphs
     DrawTextAndBoxes("qjgIPT", xx, yy); yy += 50; xx -= 100; // glyphs with descenders
     DrawTextAndBoxes("````````", xx, yy); yy += 50; // high small glyphs

to test/unittest_text.cxx to see whether the text runs out of the bounding
box:

<http://d--p.de/tmp/2015-07-12_fltk-unittests.png>

but this seems to be ok. Can someone else reproduce this issue with
Mac OS X?

(Screenshots made with fltk-1.3.x-r10783, fltk-1.3.3 looks similar.)

Dennis

manol...@gmail.com

Jul 13, 2015, 5:54:28 PM
to fltkg...@googlegroups.com, d_...@d--p.de
I see this problem with the editor program and your string on Mac OS X.

The problem arises because the width of the warning character computed by FLTK,
which determines the advancement of the cursor when moving along the line with arrow keys,
does not match the displayed character width: the cursor does not move completely to the
right end of the warning glyph.

FLTK computes the width of each character it draws, caches it, and computes
string widths by summing the widths of the individual characters. That is how it achieves very fast
drawing of large quantities of text.

FLTK draws lines in a single command in the editor program. It turns out that the width
of the drawn line is bigger than the sum of the widths of the characters composing
the line, when the special warning character occurs. Moreover, FLTK clips drawing
to the width of the string it computed. The effect is that the end of the line is truncated.
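
The mismatch can be illustrated with a toy model. This is not FLTK code, and every width in it is an invented number: `summed_width` stands in for the per-character cache, `rendered_width` for a text engine that swaps in a wider glyph when U+26A0 is followed by U+FE0F.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Toy model, not FLTK code. All widths below are invented for illustration.
using CodePoints = std::vector<std::uint32_t>;

// Per-code-point width cache: the selector U+FE0F contributes no width,
// and U+26A0 is cached with the width of its text-style glyph.
int summed_width(const CodePoints& text) {
    static const std::map<std::uint32_t, int> cache = {
        {0x26A0, 20}, {0xFE0F, 0}, {'Y', 18}, {'Z', 18}};
    int w = 0;
    for (std::uint32_t cp : text) w += cache.at(cp);
    return w;
}

// Stand-in for the system text engine: U+26A0 followed by U+FE0F is drawn
// with a wider emoji-style glyph (30 instead of 20 in this model).
int rendered_width(const CodePoints& text) {
    int w = 0;
    for (std::size_t i = 0; i < text.size(); ++i) {
        const bool emoji = text[i] == 0x26A0 && i + 1 < text.size() &&
                           text[i + 1] == 0xFE0F;
        if (emoji) { w += 30; ++i; }  // wider glyph, selector consumed
        else       { w += summed_width({text[i]}); }
    }
    return w;
}

// Width that is lost when drawing is clipped to the summed width.
int clipped_off(const CodePoints& text) {
    return rendered_width(text) - summed_width(text);
}
```

In this model the line with the selector loses 10 units at its right end, while the same line without the selector loses nothing.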

FLTK uses the width of the full line when computing text bounding boxes. This explains why
the unittest does not show the problem.

That is the analysis of the cause of the problem. But I have no solution at this point.

File src/fl_font_mac.cxx is where the relevant code is.

manol...@gmail.com

Jul 14, 2015, 12:46:03 PM
to fltkg...@googlegroups.com, d_...@d--p.de


On Sunday, 12 July 2015 22:48:20 UTC+2, Dennis Preiser wrote:
> Hi,
>
> during test of an fltk-based Newsreader (flnews by Michael Bäuerle) in
> de.alt.test we encountered a strange issue. In combination with special
> UTF-8 selectors some characters were not properly drawn on Mac OS X.
>
> This issue can be reproduced with the shipped example editor in test
> (test/editor). Here is a file which triggers this issue:
>
> <http://d--p.de/tmp/fltk_anomaly.txt>
>
> The first line begins with a warning sign (U+26A0) followed by a variant
> selector (U+FE0F).

If you remove the variant selector, the text is drawn correctly by
FLTK. The warning sign has the text color instead
of being yellow, and its width is correctly computed by FLTK (cursor moving test).
The variant selector makes FLTK draw it with (presumably) another
font for a larger glyph, and the glyph width does not match the width
computed by FLTK, which sees only the warning character (without
the variant selector) when it computes widths.

What is the purpose of the variant selector?

I notice that TextWrangler draws your string with correct widths, but ignoring the
variant selector since it shows the same glyph as FLTK without variant
selector.

Would it be acceptable for FLTK to skip variant selectors when it encounters them
since its width computation cannot account for them? This can be coded easily,
and would have FLTK do as TextWrangler (at least on your example).


Dennis Preiser

Jul 14, 2015, 4:24:49 PM
to fltkg...@googlegroups.com
manolo.gouy wrote:
> On Sunday, 12 July 2015 22:48:20 UTC+2, Dennis Preiser wrote:
>> The first line begins with a warning sign (U+26A0) followed by a variant
>> selector (U+FE0F).
>
> If you remove the variant selector, the text is drawn correctly by
> FLTK. The warning sign has the text color instead of being yellow, and
> its width is correctly computed by FLTK (cursor moving test). The
> variant selector makes FLTK draw it with (presumably) another font for
> a larger glyph, and the glyph width does not match the width computed
> by FLTK, which sees only the warning character (without the variant
> selector) when it computes widths.
>
> What is the purpose of the variant selector?

That's exactly what the selector should do: choose between text-style
and emoji-style.

There are other selectors, for instance to modify the skin tone of
emojis:

<http://emojipedia.org/skin-tone-modifiers/>

> I notice that TextWrangler draws your string with correct widths, but
> ignoring the variant selector since it shows the same glyph as FLTK
> without variant selector.

I guess that TextWrangler simply ignores the variant selector.

> Would it be acceptable for FLTK to skip variant selectors when it
> encounters them since its width computation cannot account for them?

Personally, I'd prefer it if fltk would be able to deal with the
selectors in the intended manner. Unfortunately, I cannot provide a
patch.

Dennis

manol...@gmail.com

Jul 14, 2015, 5:49:06 PM
to fltkg...@googlegroups.com, d_...@d--p.de


On Tuesday, 14 July 2015 22:24:49 UTC+2, Dennis Preiser wrote:
>> Would it be acceptable for FLTK to skip variant selectors when it
>> encounters them since its width computation cannot account for them?
>
> Personally, I'd prefer it if fltk would be able to deal with the
> selectors in the intended manner. Unfortunately, I cannot provide a
> patch.
>
> Dennis

Could you please try the attached patch, which would make FLTK behave like TextWrangler,
and report whether that's an acceptable solution?
Attachment: variant.patch

Michael Bäuerle

Jul 15, 2015, 4:32:36 AM
to fltkg...@googlegroups.com
manol...@gmail.com wrote:
>
> [warning sign with emoji style]
> The variant selector makes FLTK draw it with (presumably) another
> font for a larger glyph, and the glyph width does not match the width
> computed by FLTK, which sees only the warning character (without
> the variant selector) when it computes widths.

Question from me as an X11 user:
A problem that at least looks similar occurs with the X11 glyph
substitution patches from Ian (ends of lines truncated):
<http://www.fltk.org/str.php?L1903>
But in this case the whole bounding box is too small if multiple fonts
are used.

@Ian:
In September you wrote that you know a way to fix this.
Would the fix you have in mind also catch this issue, or would variation
selectors be a problem in general on X11 too?

I ask because currently (in the unpatched X11 code) the variation
selector is handled incorrectly: it is not detected at all and is printed
as a separate replacement character, which looks ugly:
<http://micha.freeshell.org/tmp/2015-07-06_flnews_variation_selector.png>
If you think that this will never work as intended on X11 (even with a
future version of your glyph substitution patch), I would try to create
a patch that detects and skips the variation selectors completely (do
nothing, print nothing) for X11.

MacArthur, Ian (Selex ES, UK)

Jul 15, 2015, 6:35:09 AM
to fltkg...@googlegroups.com
> Question from me as X11 user:
> A problem that at least looks similar occur with the X11 glyph
> substitution patches from Ian (end of lines truncated):
> <http://www.fltk.org/str.php?L1903>
> But in this case the whole bounding box is too small if multiple fonts
> are used.

Hi Michael,

Well, that's an interesting question; as I understand it, the two problems exhibit similar "features", but the underlying causes are somewhat different, though perhaps related.

I'll try to describe what I *think* the problems are; others can then add corrections...

OSX:

In the OSX case the text editor truncates the string when it is rendered, but the unittest bounding box looks correct (indeed probably *is* correct.)

This happens because fltk computes the text editor clip region by summing the individual glyph widths, but this summation is wrong in this case because fltk does not account for the variant selector, and so computes the "default" width for the varied glyph, rather than the actual width of the glyph as it is rendered.

The unittest bounding box is computed by measuring the "inked" region of the whole string, as that string is created by the OSX font system, and this *does* account for the variant glyph selector substitution etc., since OSX font system supports the variant selector.

So we see the string clipped in the text editor, but the unittest renders the correct bounding boxes.

We could potentially move to drop the "per glyph width summation" for text rendering in fltk, and always use the sizes of whole strings, as reported by the OSX font system, but that might be quite a big change internally, and may well be slower (or, it might be faster on modern font systems? I do not have any metrics for this...)


X11:

The X11 case Michael refers to, using my hack X11/XFT glyph substitution scheme, shows a *similar* fault, but that affects both the text editor and the unittest.

To try and explain why...

What my hack is doing is trying to substitute any glyphs that are missing from fltk's current font, by looking for the missing glyph in a (user specified) set of alternate fonts. (To use it, the user selects the fltk current font in the usual way, but also loads a set of optional fallback fonts to search for missing glyphs.)

In practice, this works fairly well; most strings are rendered in the current font, but any individual words in the string that have problem glyphs are rendered in whichever of the "fallback" fonts contains the majority (ideally all) of the problem glyphs.

But the hack is incomplete: the glyph substitution I describe happens "correctly" when the string is rendered, but with the present hack, the computation of widths (either on a per-glyph basis as done by the text editor, or on a per-string basis as done by the unittest) is always done using just the fltk current font (as the stock fltk font engine does.)

This happens simply because (so far) I have never made the necessary changes to use the hack substitution model in all cases, but only in the actual rendering pass. As a result the string *as rendered* tends to be a different size from the string *as measured*, if any glyphs need to be substituted (but are mostly fine if no glyphs are substituted, or at least no worse than stock fltk!)
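
A toy model of that measure/render mismatch follows. It is not the actual X11/XFT hack: it resolves fonts per glyph rather than per word, a "font" is just a map from code point to advance width, and all names and numbers are invented for illustration.

```cpp
#include <cstdint>
#include <map>
#include <vector>

// Toy model, not the actual X11/XFT hack: a "font" maps code points to
// advance widths, and all numbers are invented.
using Font = std::map<std::uint32_t, int>;
const int kMissingGlyphWidth = 10;  // width of the replacement box

// Width of one code point in the first font of the chain that contains it.
int glyph_width(const std::vector<Font>& chain, std::uint32_t cp) {
    for (const Font& f : chain) {
        auto it = f.find(cp);
        if (it != f.end()) return it->second;
    }
    return kMissingGlyphWidth;
}

// Width of a string when glyphs are resolved through the given font chain.
int string_width(const std::vector<Font>& chain,
                 const std::vector<std::uint32_t>& text) {
    int w = 0;
    for (std::uint32_t cp : text) w += glyph_width(chain, cp);
    return w;
}
```

Measuring with a chain that holds only the current font while rendering with a chain that also holds the fallback fonts gives two different widths as soon as one glyph is substituted. Passing the same chain to both steps removes the discrepancy in this model.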

Again, reworking fltk to always try and measure what would actually be rendered is likely to be (at least a part of) the solution. But how we get there might be somewhat different on the two platforms.

And I have no thoughts on what the WinXX port will do... I suspect it will behave like OSX, if it even honours the variant glyph selector substitution at all... Hmm, OK, looks like this Win7 box does not even honour the variant selector and I just get a nasty replacement glyph...



> @Ian:
> In September you have written that you know a way how to fix this.
> Would the fix in your mind also catch this issue or would variation
> selectors be a problem in general on X11 too?

The specific issue with my X11 hack, as noted above, is that the code currently measures "the wrong string"; the "fix" for my hack, that I alluded to back in September, was to always pass strings through my substitution process, both when rendering and measuring them (though this might often entail effectively running substitution on each string twice in some cases.) Note that Oksid's old Xlib substitution scheme did basically that, though in a way that is awkward for me to use with my XFT hack...

I have not done this because it will surely have a performance impact, will involve refactoring a fair bit of code, and I just never got around to it.

Also, I do wonder if the more invasive "fix" of reworking fltk to drop the per-glyph width summation and just measuring the whole (substituted) string might actually end up being faster in the future, as we need to render more complex texts...

As regards variant glyph selector substitution, I am not sure to what extent X11/XFT really understands that, or if we need to do something in fltk to make it work.

I suspect it does not really work at all at present (as you report below...)

>
> I ask because currently (in the unpatched X11 code) the variation
> selector is handled wrong - not detected at all and print as a separate
> replacement character - this looks ugly:
> <http://micha.freeshell.org/tmp/2015-07-06_flnews_variation_selector.png>
> If you think that this will never work as intended on X11 (even with a
> future version of your glyph substitution patch), I would try to create
> a patch that detect and skip the variation selectors completetly (do
> nothing, print nothing) for X11.

I suspect that in the short term we should probably patch the stock fltk to elide the variation selectors from any string we process.
I think this is more or less what Manolo is proposing as the workaround for OSX anyway?

Longer term it would be nice to do something cleverer. I do not really know what, though. It may be that switching to measuring (and rendering) "whole strings" rather than "per glyph" proves to be the better option, but that may be a big change in the way fltk handles text rendering in general.


Or I could be talking rubbish again...
--
Ian






Dennis Preiser

Jul 15, 2015, 11:57:34 AM
to fltkg...@googlegroups.com
manol...@gmail.com wrote:
> On Tuesday, 14 July 2015 22:24:49 UTC+2, Dennis Preiser wrote:
>>
>> > Would it be acceptable for FLTK to skip variant selectors when it
>> > encounters them since its width computation cannot account for them?
>>
>> Personally, I'd prefer it if fltk would be able to deal with the
>> selectors in the intended manner. Unfortunately, I cannot provide a
>> patch.
>
> Could you, please, try the attached patch that would make FLTK behave as
> TextWrangler,

I can confirm that FLTK ignores the variant selector. Here is what I
get now:

<http://d--p.de/tmp/2015-07-15_fltk_variant_patch.png>

> and report whether that's an acceptable solution.

I think that's an acceptable solution for the time being.

Dennis

manol...@gmail.com

Jul 15, 2015, 5:59:58 PM
to fltkg...@googlegroups.com, d_...@d--p.de


On Wednesday, 15 July 2015 17:57:34 UTC+2, Dennis Preiser wrote:
>> Could you, please, try the attached patch that would make FLTK behave as
>> TextWrangler,
>
> I can confirm that FLTK ignores the variant selector. Here is what I
> get now:
>
> <http://d--p.de/tmp/2015-07-15_fltk_variant_patch.png>
>
>> and report whether that's an acceptable solution.
>
> I think that's an acceptable solution for the time being.
>
> Dennis

Good. This solution is now in the svn repository, to appear with FLTK 1.3.4.

MacArthur, Ian (Selex ES, UK)

Jul 16, 2015, 5:23:47 AM
to fltkg...@googlegroups.com
> > > and report whether that's an acceptable solution.

> > I think that's an acceptable solution for the time being.


> Good. This solution is now in the svn repository, to appear with FLTK 1.3.4


OK... Do we want/need to apply the analogous workaround (the eliding of variant selector codes from Unicode strings) to the X11/XFT/Xlib and WinXX ports too?

I'm thinking we probably do: Michael B reports that his X11 tests behave poorly in the presence of variant selectors, and testing here on Win7 with the sample text that Dennis proposed showed aberrant behaviour...

Opinions?

Manolo Gouy

Jul 16, 2015, 10:28:56 AM
to fltkg...@googlegroups.com

> On 16 Jul 2015, at 11:23, MacArthur, Ian wrote:
>
>>>> and report whether that's an acceptable solution.
>
>>> I think that's an acceptable solution for the time being.
>
>
>> Good. This solution is now in the svn repository, to appear with FLTK 1.3.4
>
>
> OK... Do we want/need to apply the analogous workaround (the eliding of variant selector codes from Unicode strings) to the X11/XFT/Xlib and WinXX ports too?
>
> I'm thinking we probably do: Michael B reports that his X11 tests behave poorly in the presence of variant selectors, and testing here on Win7 with the sample text that Dennis proposed showed aberrant behaviour...
>
> Opinions?
>
> --
> Ian

I concur with that proposal (note that so far I have only experimented on the Mac platform).
This is what is now in the Mac source code:
All text-drawing fl_draw() functions call a common function that converts the UTF-8 encoded text argument
into a temporary UTF-16 string. All Unicode code points in the range [0xFE00-0xFE0F] are removed from this string,
which is then sent to the Mac OS text drawing API.
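
The filtering step described here could look roughly like the sketch below. It is an illustration under assumptions, not the code in src/fl_font_mac.cxx: `strip_bmp_variation_selectors` is a hypothetical name, and the input is taken to be an already converted UTF-16 string.

```cpp
#include <string>

// Sketch only; strip_bmp_variation_selectors is a hypothetical name,
// not the function used in the FLTK Mac sources.
std::u16string strip_bmp_variation_selectors(const std::u16string& in) {
    std::u16string out;
    out.reserve(in.size());
    for (char16_t unit : in) {
        // U+FE00..U+FE0F lie in the BMP, so each is a single UTF-16 unit
        // and can never be half of a surrogate pair (0xD800..0xDFFF).
        if (unit >= 0xFE00 && unit <= 0xFE0F) continue;
        out.push_back(unit);
    }
    return out;
}
```

Because the test is made per 16-bit unit, surrogate pairs for supplementary-plane characters pass through untouched.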

MacArthur, Ian (Selex ES, UK)

Jul 16, 2015, 10:33:09 AM
to fltkg...@googlegroups.com
> This is what is now in the Mac source code:
> All text drawing fl_draw() functions call a common function that
> converts the utf-8 encoded text argument
> into a temporary UTF16 string. All unicode points in the [0xFE00-0xFE0F]
> range are removed from this string
> which is then sent to the Mac OS text drawing API.

Yes, I was wondering... do we also need to adjust the OSX code for fl_measure(), fl_text_extents(), etc., to also deal with eliding the variant selector codes?

As it stands, we may not be measuring what we print, I think?


That is apart from making the equivalent changes in the X11 and WIN32 code too...

Manolo Gouy

Jul 16, 2015, 11:23:32 AM
to fltkg...@googlegroups.com

> On 16 Jul 2015, at 16:33, MacArthur, Ian wrote:
>
>> This is what is now in the Mac source code:
>> All text drawing fl_draw() functions call a common function that
>> converts the utf-8 encoded text argument
>> into a temporary UTF16 string. All unicode points in the [0xFE00-0xFE0F]
>> range are removed from this string
>> which is then sent to the Mac OS text drawing API.
>
> Yes, I was wondering... do we also need to adjust the OSX code for fl_measure(), fl_text_extents(), etc., to also deal with eliding the variant selector codes?
>
> As it stands, we may not be measuring what we print, I think?
>
>
> That apart from making the equivalent changes in X11 and WIN32 codes too...

Last minute news: I have now found how to correctly handle 'variation selectors'
for the Mac platform of FLTK, and committed it (r. 10792).
Other platforms can skip them if it’s not possible to handle them.

Dennis Preiser

Jul 16, 2015, 12:42:53 PM
to fltkg...@googlegroups.com
Manolo Gouy <manol...@univ-lyon1.fr> wrote:
> Last minute news: I have now found how to correctly handle 'variation selectors'
> for the Mac platform of FLTK, and committed it (r. 10792).
> Other platforms can skip them if it’s not possible to handle them.

This seems to solve the issue completely. (We had another issue
regarding this variant selector, a lost character within a word, not
reported yet. This is also fixed.) Thanks :)

The editor with your fix:

<http://d--p.de/tmp/2015-07-16_fltk_10792.png>

Dennis

Michael Bäuerle

Jul 16, 2015, 1:16:57 PM
to fltkg...@googlegroups.com
Manolo Gouy wrote:
> MacArthur, Ian wrote:
> >
> > OK... Do we want/need to apply the analogous workaround (the eliding
> > of variant selector codes from Unicode strings) to the X11/XFT/Xlib
> > and WinXX ports too?
> >
> > I’m thinking we probably do: Michael B reports that his X11 tests
> > behave poorly in the presence of variant selectors, and testing
> > here on Win7 with the sample text that Dennis proposed showed
> > aberrant behaviour...
> >
> > Opinions?
>
> I concur with that proposal (note that at this point I only
> experienced with the Mac platform). This is what is now in the Mac
> source code:
> All text drawing fl_draw() functions call a common function that
> converts the utf-8 encoded text argument into a temporary UTF16
> string. All unicode points in the [0xFE00-0xFE0F] range are removed
> from this string which is then sent to the Mac OS text drawing API.

We should remember here, that the variation selectors in the BMP listed
above are not the only ones. There are more variation selectors, at
least:

In the SMP for skin tone: U+1F3FB..U+1F3FF
In the SSP for CJK style: U+E0100..U+E01EF

If the font rendering engine doesn't support styles, I think
we should strip all of them out.

Albrecht Schlosser

Jul 17, 2015, 9:25:14 AM
to fltkg...@googlegroups.com
On 16.07.2015 19:16 Michael Bäuerle wrote:

> We should remember here, that the variation selectors in the BMP listed
> above are not the only ones. There are more variation selectors, at
> least:
>
> In the SMP for skin tone: U+1F3FB..U+1F3FF
> In the SSP for CJK style: U+E0100..U+E01EF
>
> If the font rendering engine don't have support for styles, I think
> we should map all of them out.

Maybe these are all non-spacing characters? If yes, they /should/ return
true when checked with fl_nonspacing().

However, I don't find them in the lists of fl_nonspacing(), aka
XUtf8IsNonSpacing(). And, BTW, fl_nonspacing() returns `unsigned short'
which would not be sufficient for the variation selectors mentioned
above (the convention seems to be to return either 0 or the character
(code point)). Wondering...
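
If the return type is indeed 16 bits wide, the concern is easy to demonstrate. The sketch below is an assumption-laden illustration, not FLTK code: `as_16bit_return` is a hypothetical function that models returning a code point through a 16-bit type.

```cpp
#include <cstdint>

// Hypothetical model of a function that returns a code point through a
// 16-bit type: only the low 16 bits of the value survive.
std::uint16_t as_16bit_return(std::uint32_t ucs) {
    return static_cast<std::uint16_t>(ucs);
}
```

A BMP selector such as U+FE0F comes back unchanged, but U+1F3FB comes back as 0xF3FB and U+E0100 as 0x0100, which are different characters.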

Another note: the code committed recently for Mac OS seems to convert
the UTF-8 string to UTF-16, which would result in surrogate pairs (two
short's) for the code points above. Do we support these, or not? What if
they appear in user text strings?
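
For reference, this is how a code point above U+FFFF turns into two 16-bit units. The sketch uses the standard UTF-16 arithmetic; `to_surrogates` is a hypothetical helper name, not an FLTK function.

```cpp
#include <cstdint>
#include <utility>

// Sketch: split a supplementary-plane code point (U+10000..U+10FFFF)
// into its UTF-16 high and low surrogate. The caller must ensure the range.
std::pair<std::uint16_t, std::uint16_t> to_surrogates(std::uint32_t cp) {
    const std::uint32_t v = cp - 0x10000;  // 20-bit value
    const std::uint16_t high = static_cast<std::uint16_t>(0xD800 + (v >> 10));
    const std::uint16_t low  = static_cast<std::uint16_t>(0xDC00 + (v & 0x3FF));
    return {high, low};
}
```

U+1F3FB becomes D83C DFFB and U+E0100 becomes DB40 DD00, so any filter that works on single 16-bit units has to match both halves to remove such a code point.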

Sorry, I don't have answers and can't check right now. Maybe others can
chime in...

Manolo Gouy

Jul 17, 2015, 1:39:26 PM
to fltkg...@googlegroups.com

> On 17 Jul 2015, at 15:25, Albrecht Schlosser wrote:
>
>
> Another note: the code committed recently for Mac OS seems to convert the UTF-8 string to UTF-16, which would result in surrogate pairs (two short's) for the code points above. Do we support these, or not? What if they appear in user text strings?

UTF-8 text that yields surrogate pairs when converted to UTF-16 is handled correctly on the Mac platform of FLTK.

MacArthur, Ian (Selex ES, UK)

Jul 20, 2015, 5:34:24 AM
to fltkg...@googlegroups.com
>
> > We should remember here, that the variation selectors in the BMP listed
> > above are not the only ones. There are more variation selectors, at
> > least:
> >
> > In the SMP for skin tone: U+1F3FB..U+1F3FF
> > In the SSP for CJK style: U+E0100..U+E01EF
> >
> > If the font rendering engine don't have support for styles, I think
> > we should map all of them out.
>
> Maybe these are all non-spacing characters? If yes, they /should/ return
> true when checked with fl_nonspacing().

Well, yes, they are non-spacing characters (in fltk terms), but the thing that might make things a bit more awkward here is that a "normal" non-spacing glyph might be something like (for example) a cedilla stuck onto a plain-ASCII "c", or something.

So both codepoints represent a printable glyph, that probably exists in the font...

The variant selectors are different, in that they are not printable glyphs at all; what they do is say "See that glyph you just fetched from the font - well, don't use that, use a different-but-similar glyph, if the current font contains it..."

Now, it appears that OSX font engine understands that and does something sensible; it appears that X11/Win32 font engines do not.

Further, fltk then tries to print the variant selector as if it were a printable glyph, and the font engine therefore prints the missing glyph replacement character on the screen.

So... unless we can figure out how to get X11/XFT/Win32/whatever to honour the variant selectors, we might be better off eliding them from the printable string; in that case we get the default glyph representation (if it exists in the font) which is better than nothing, and likely better than what we currently have...

Or maybe not. I do not know.
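
On platforms that work with UTF-8 strings directly, the BMP selectors could in principle be elided at the byte level. The sketch below is only an illustration: `strip_vs_utf8` is a hypothetical helper, not FLTK API, and it assumes well-formed UTF-8. It relies on the fact that U+FE00..U+FE0F encode as the bytes EF B8 80 through EF B8 8F.

```cpp
#include <cstddef>
#include <string>

// Sketch only; strip_vs_utf8 is a hypothetical helper, not FLTK API.
// Assumes well-formed UTF-8. U+FE00..U+FE0F encode as EF B8 80 .. EF B8 8F.
std::string strip_vs_utf8(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    std::size_t i = 0;
    while (i < in.size()) {
        const bool is_vs =
            i + 2 < in.size() &&
            static_cast<unsigned char>(in[i])     == 0xEF &&
            static_cast<unsigned char>(in[i + 1]) == 0xB8 &&
            (static_cast<unsigned char>(in[i + 2]) & 0xF0) == 0x80;
        if (is_vs) { i += 3; continue; }  // drop the selector
        out.push_back(in[i++]);
    }
    return out;
}
```

In well-formed UTF-8 the byte 0xEF only ever starts a three-byte sequence, so the pattern cannot match in the middle of another character. The supplementary-plane ranges mentioned earlier in the thread would need four-byte patterns and are not covered here.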


>
> However, I don't find them in the lists of fl_nonspacing(), aka
> XUtf8IsNonSpacing(). And, BTW, fl_nonspacing() returns `unsigned short'
> which would not be sufficient for the variation selectors mentioned
> above (the convention seems to be to return either 0 or the character
> (code point)). Wondering...

As I understand it, fl_nonspacing() only covers the BMP, so an unsigned short is fine. The variant selectors are (IIRC) part of the BMP, so they are fine.

Whether fl_nonspacing() should be extended to cover all the supplemental planes (and hence cover the additional selectors that Michael reports) is a big question.
If we try to cover all the SMP's, I suppose a lookup table might not be the best option...


>
> Another note: the code committed recently for Mac OS seems to convert
> the UTF-8 string to UTF-16, which would result in surrogate pairs (two
> short's) for the code points above. Do we support these, or not? What if
> they appear in user text strings?

I think we handle surrogate pairs OK for the most part, at least on OSX. I think they are OK on Win32 as well.
For X11/XFT we tend to use UCS4 where we hit these sort of codepoints, so the pairs issue does not arise.

Michael Bäuerle

Jul 20, 2015, 11:41:09 AM
to fltkg...@googlegroups.com
MacArthur, Ian (Selex ES, UK) wrote:
>
> [...]
> So... unless we can figure out how to get X11/XFT/Win32/whatever to
> honour the variant selectors, we might be better off eliding them from
> the printable string; in that case we get the default glyph
> representation (if it exists in the font) which is better than nothing,
> and likely better than what we currently have...

+1 from me for this way to go (at least for X11).

I think it is very unlikely that a single font contains multiple
variants of such glyphs; in most cases you would need an additional
emoji font for the alternate versions. So, at least as long as the
X11 font rendering of FLTK doesn't support glyph substitution, it may
be fully acceptable to strip the variation selectors completely.

> > However, I don’t find them in the lists of fl_nonspacing(), aka
> > XUtf8IsNonSpacing(). And, BTW, fl_nonspacing() returns `unsigned short’
> > which would not be sufficient for the variation selectors mentioned
> > above (the convention seems to be to return either 0 or the character
> > (code point)). Wondering...
>
> As I understand it, fl_nonspacing() only covers the BMP, so an
> unsigned short is fine. The variant selectors are (IIRC) part of the
> BMP, so they are fine.
>
> Whether fl_nonspacing() should be extended to cover all the
> supplemental planes (and hence cover the additional selectors that
> Michael reports) is a big question.

I think it already supports them (in the sense of "ignoring them").
Nothing will match and the return value will always be zero if a
non-BMP codepoint is used as parameter.
(I assume that 'unsigned int' can hold the codepoint value.
It should be at least 32 bit on all supported platforms).

> If we try to cover all the SMP’s, I suppose a lookup table might not
> be the best option...

The variation selector check may only add six comparisons. Additional
lookup tables should not be required (at least if we want to treat them
all the same, because the whole ranges are variation selectors).

/* Overlaps the last region in 'XUtf8IsNonSpacing()' */
if (ucs <= 0xFE0F) {
  if (ucs >= 0xFE00) return ucs;
}

if (ucs <= 0x1F3FF) {
  if (ucs >= 0x1F3FB) return ucs;
}

if (ucs <= 0xE01EF) {
  if (ucs >= 0xE0100) return ucs;
}

Maybe these checks should go into a separate function because not
all nonspacing characters should be removed (e.g. FLTK supports
decomposed umlauts on X11: it prints the base character and then
prints a diaeresis glyph on top of it without spacing). The result
ranges from ugly to unreadable for capital letters, but it is better
than nothing and should not be broken by the new code.

This new function could be called from 'XUtf8IsNonSpacing()' if it
should report variation selectors as nonspacing too. And it can
be called directly from the code that decides whether a codepoint
should be ignored because it is a variation selector.
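
Such a separate function might look like the sketch below. `is_variation_selector` is a hypothetical name, and the ranges are simply the three named in this thread. Strictly speaking, Unicode classifies U+1F3FB..U+1F3FF as emoji modifiers rather than variation selectors, but they are treated the same way here.

```cpp
// Sketch of the separate check discussed above; is_variation_selector is a
// hypothetical name. Returns the code point if it falls in one of the three
// ranges named in this thread, otherwise 0 (the return convention described
// for XUtf8IsNonSpacing()).
unsigned int is_variation_selector(unsigned int ucs) {
    if (ucs >= 0xFE00  && ucs <= 0xFE0F)  return ucs;  // BMP selectors
    if (ucs >= 0x1F3FB && ucs <= 0x1F3FF) return ucs;  // skin tone (SMP)
    if (ucs >= 0xE0100 && ucs <= 0xE01EF) return ucs;  // CJK style (SSP)
    return 0;
}
```

Keeping the check separate means a combining mark such as U+0308 (diaeresis) is not reported, so code that strips selectors would leave decomposed umlauts alone.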


... thinking out loud, please correct me if required.

Michael Bäuerle

Jul 20, 2015, 11:51:53 AM
to fltkg...@googlegroups.com
Albrecht Schlosser wrote:
> On 16.07.2015 19:16 Michael Bäuerle wrote:
> >
> > We should remember here, that the variation selectors in the BMP listed
> > above are not the only ones. There are more variation selectors, at
> > least:
> >
> > In the SMP for skin tone: U+1F3FB..U+1F3FF
> > In the SSP for CJK style: U+E0100..U+E01EF
> >
> > If the font rendering engine don’t have support for styles, I think
> > we should map all of them out.
>
> Maybe these are all non-spacing characters? If yes, they /should/
> return true when checked with fl_nonspacing().

That is how I understand it (a variation selector is always a modifier
and, on its own, neither printable nor spacing). It only changes the
appearance of the preceding character.