Linux UTF-8, GetPartialTextExtents, and Diacritics

Gerald Brandt

Nov 20, 2009, 8:28:28 AM
to wx-users
Hi,

I'm using the latest trunk, and have a question about GetPartialTextExtents when dealing with UTF-8 and diacritics (combining characters).

I have the string " STARGATE SG-1 " (the 'A' in GATE has a small circle on top of it, as in http://wiki.urbandead.com/images/8/8b/Stargate_SG-1_Season_8_Title.jpg).

This is a table of what GetPartialTextExtents gives me for the UTF-8 string.  I'm using a monospaced font just to make things easier.


UTF-8    Extent
0x0020        7
0x0020       14
0x0053       21
0x0054       28
0x0041       35
0x0052       42
0x0047       49
0x039b       56
0x030a       63
0x0054       70
0x0045       77
0x0020       84
0x0053       91
0x0047       98
0x002d      105
0x0031      112
0x002c      119
0x0020      119
0x000d      119

The diacritic is 0x030a.  Shouldn't its accumulated width match that of the character before it, since it has no width itself (it's printed in the same space as the previous letter)?  And at the end of the extents, both 0x002c and 0x0020 have the same accumulated width.  It appears that GetPartialTextExtents kind of knows about diacritics, but applies them at the wrong position in the array.
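
The expectation described here can be sketched in Python. This is only an illustration under stated assumptions: a fixed advance of 7 per spacing character (taken from the table), and zero advance for nonspacing marks and control characters such as 0x000d. The names `CELL_WIDTH`, `CODE_POINTS` and `expected_extents` are made up for this sketch and are not part of wxWidgets or Pango.

```python
import unicodedata

CELL_WIDTH = 7  # assumed advance per spacing character, taken from the table

# Code points from the table above, in order.
CODE_POINTS = [0x20, 0x20, 0x53, 0x54, 0x41, 0x52, 0x47, 0x39B, 0x30A,
               0x54, 0x45, 0x20, 0x53, 0x47, 0x2D, 0x31, 0x2C, 0x20, 0x0D]


def expected_extents(code_points, cell_width=CELL_WIDTH):
    """Accumulated widths, treating marks and controls as zero-width (assumption)."""
    extents = []
    total = 0
    for cp in code_points:
        category = unicodedata.category(chr(cp))
        # Mn/Me: combining marks, Cc: control characters such as CR.
        if category not in ("Mn", "Me", "Cc"):
            total += cell_width
        extents.append(total)
    return extents


extents = expected_extents(CODE_POINTS)
print(extents)
```

Under these assumptions the entry for 0x030a stays at 56, the same as 0x039b before it, and the final value is still 119, which matches the last value in the reported table. Only the position of the zero-width step differs.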

Am I reading this right, or am I way off base?

Gerald



Gerald Brandt

Nov 20, 2009, 8:34:58 AM
to wx-u...@googlegroups.com
Just to reply to myself...

I have another, longer string, so I don't want to put it into a table for you.  It contains 4 diacritics.  One of them (and only one) behaves as I expect and describe above.  The other 3 do not, resulting in the last few characters having the same accumulated width.

Gerald

Vadim Zeitlin

Nov 20, 2009, 9:19:29 AM
to wx-u...@googlegroups.com
On Fri, 20 Nov 2009 07:28:28 -0600 (CST) Gerald Brandt <g...@majentis.com> wrote:

GB> I'm using the latest trunk, and have a question about
GB> GetPartialTextExtents when dealing with UTF-8 and diacritic (combined
GB> characters).

The fact that GetPartialTextExtents() doesn't account for characters that
combine with the previous ones definitely looks like a bug. Unfortunately I
don't really know how to fix it; I'd advise looking at the Pango
documentation, as GetPartialTextExtents() is implemented in terms of Pango
functions in wxGTK. Maybe we just need to call some function/set some flag
to fix this... but I really don't know much about Pango and have no time to
learn it now.

Sorry,
VZ

--
TT-Solutions: wxWidgets consultancy and technical support
http://www.tt-solutions.com/

Gerald Brandt

Nov 20, 2009, 10:16:13 AM
to wx-u...@googlegroups.com

Thanks for the input.  I may be able to work around it in my code, but it would obviously be a hack.
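
One possible shape for such a workaround, sketched in Python rather than wxWidgets C++: group each base character with the combining marks that follow it, measure the text only up to the end of each group, and give every code point in the group the same accumulated width. The `measure` callback is hypothetical and stands in for whatever call returns the width of a whole prefix; `fake_measure` is a stand-in that assumes a 7-pixel monospaced font. None of these names come from wxWidgets or Pango.

```python
import unicodedata


def is_mark(ch):
    """True for nonspacing and enclosing combining marks."""
    return unicodedata.category(ch) in ("Mn", "Me")


def cluster_extents(text, measure):
    """Accumulated widths where a base character and its marks share one value.

    `measure(prefix)` is a hypothetical callback returning the width of `prefix`.
    """
    extents = []
    i = 0
    n = len(text)
    while i < n:
        # Extend the group over any combining marks that follow text[i].
        j = i + 1
        while j < n and is_mark(text[j]):
            j += 1
        width = measure(text[:j])
        extents.extend([width] * (j - i))
        i = j
    return extents


def fake_measure(prefix):
    """Stand-in for a real text measurement: 7 pixels per non-mark character."""
    return 7 * sum(1 for ch in prefix if not is_mark(ch))


print(cluster_extents("G\u039b\u030aT", fake_measure))
```

Measuring every prefix separately is quadratic in the length of the string, so this would only be reasonable for short strings.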

Thanks,
Gerald

Gerald Brandt

Nov 20, 2009, 1:31:28 PM
to wx-u...@googlegroups.com
Unfortunately, Pango has some issues.  In the STARGATE example from my first post, Pango places the small circle above the T, not above the A.  That's contrary to how combining diacritics are supposed to work: the spec says the diacritic comes after the character it's associated with.  Damn.  I just wish Unicode was easier.
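
The ordering rule can be checked with Python's standard `unicodedata` module. This is a small illustration, not a statement about what Pango does internally: a combining mark such as U+030A attaches to the character before it, so "A" followed by U+030A composes to U+00C5 under NFC. The variable names are made up for this sketch.

```python
import unicodedata

RING_ABOVE = "\u030a"  # COMBINING RING ABOVE, the diacritic from the table

# A nonzero canonical combining class marks this as a combining character.
ring_class = unicodedata.combining(RING_ABOVE)

# Base character first, mark second: NFC composes the pair into U+00C5.
composed = unicodedata.normalize("NFC", "A" + RING_ABOVE)

# Decomposing U+00C5 gives the same order back: base, then mark.
decomposed = unicodedata.normalize("NFD", "\u00c5")

# U+039B followed by U+030A has no precomposed form, so it stays as two
# code points and the renderer has to stack the mark itself.
lambda_ring = unicodedata.normalize("NFC", "\u039b" + RING_ABOVE)

print(ring_class, [hex(ord(c)) for c in composed + decomposed + lambda_ring])
```

Because the 0x039b + 0x030a pair from the table cannot be collapsed into a single code point, normalizing the string first would not sidestep the extent problem for that particular character.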

Gerald


Mark Gollahon

Dec 2, 2009, 2:10:10 PM
to wx-u...@googlegroups.com
I can sometimes connect, but no data flows....

Vadim Zeitlin

Dec 2, 2009, 2:22:38 PM
to wx-u...@googlegroups.com
On Wed, 02 Dec 2009 14:10:10 -0500 Mark Gollahon <mgol...@exacq.com> wrote:

MG> I can sometimes connect, but no data flows....

Seems to work fine for me.

Regards,

Bryan Petty

Dec 2, 2009, 2:46:33 PM
to wx-u...@googlegroups.com
On Wed, Dec 2, 2009 at 12:10 PM, Mark Gollahon <mgol...@exacq.com> wrote:
> I can sometimes connect, but no data flows....

We had about 15 minutes of about 40% packet loss (effectively making
wxwidgets.org unreachable). No confirmation about hardware issues or
DDoS (my best guess is DDoS - though we weren't targeted), but the
issue has been mitigated, and we're back online. Sorry for the
inconvenience.

Regards,
Bryan Petty

Mark Gollahon

Dec 7, 2009, 5:52:25 PM
to wx-u...@googlegroups.com
Thanks!