On 05/11/2016 02:04 AM, Daniel Dehennin wrote:
> Karl Williamson <pub...@khwilliamson.com> writes:
>
>> On 05/09/2016 08:53 AM, Daniel Dehennin wrote:
>>> Hello,
>>>
>>> I tried to make my Perl5 code Unicode compliant after reading a post on
>>> Stack Overflow[1].
>>>
>>> As suggested in the post:
>>>
>>> “always run incoming stuff through NFD and outbound stuff from NFC.”
>>>
>>> I had a hard time finding out why my Test::More tests were failing while
>>> displaying exactly the same strings for “got” and “expected”.
>>>
>>> I finally checked how UTF-8 sources are handled and found that they are in
>>> NFC form. I ran the following script:
>
> [...]
>
>> I'm afraid that when it comes to normalization in Perl5, you have to
>> do it yourself. I hear that Perl6 is much friendlier in this regard,
>> but I have no personal experience with it. Your $unistring is in
>> whatever normalization you made it when you typed it into your editor,
>> or whatever your editor did with it as you were typing. You could
>> have typed it in NFD, but the most natural way to enter things on
>> your keyboard will, underneath it all, probably produce NFC.
>
> That's what I finally found out in another post. Normally all my inputs
> are NFD, but my tests used static strings to match, so I declared them in
> NFD to make it explicit.
>
> I added a note in my POD to signal that the sub returns NFD strings.
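
For anyone following along, here is a minimal sketch of the mismatch
discussed above, assuming the core Unicode::Normalize module is available.
The helper names normalize_in and normalize_out are hypothetical, made up
for this illustration; they are not taken from the original script, which
is elided above.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFD NFC);

# Hypothetical helpers, named here for illustration only.
sub normalize_in  { return NFD($_[0]) }   # decompose incoming text
sub normalize_out { return NFC($_[0]) }   # recompose outgoing text

# U+00E9 is the precomposed (NFC) form of "e with acute accent".
my $typed = "\x{E9}";
# "e" followed by U+0301 COMBINING ACUTE ACCENT is the NFD form.
my $decomposed = "e\x{301}";

# The two strings render identically but differ code point by code point,
# so a plain string comparison only succeeds after normalizing both sides
# to the same form.
print "raw strings equal: ", ($typed eq $decomposed ? "yes" : "no"), "\n";
print "equal after NFD:   ", (normalize_in($typed) eq $decomposed ? "yes" : "no"), "\n";
print "equal after NFC:   ", (normalize_out($decomposed) eq $typed ? "yes" : "no"), "\n";
```

The same idea applies to Test::More: is() compares strings code point by
code point, so "got" and "expected" can look identical in the terminal
and still fail unless both are in the same normalization form.
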
the locale is recognized by Perl5 to be a UTF-8 locale. It depends on
the libc implementation for your platform. There are bugs in Perl5's