PersonNames - calculating maximal likely locale from likely subtags


Kip Cole

Apr 22, 2023, 5:28:41 AM
to CLDR Users Public Mail List
In my implementation of PersonName formatting, I am stuck at the part of the spec that says:
  1. Otherwise, find the maximal likely locale for the name script and return its base language (first subtag).
Where maximal likely locale is defined as:

> The term maximal likely locale used below is the result of using the Likely Subtags data to map
> from a locale to a full representation that includes the base language, script, and region.

Given that I have the script “Latn”, it’s not clear to me how to derive the language, since the definition
of maximal likely locale isn’t clear (to me).

The likely subtags data is ordered lexically by “from” locale. Taking the script “Latn” as the example,
the first (lexically) language that references “Latn” is:

> <likelySubtag from="aa" to="aa_Latn_ET"/>

But I don’t think the intention would be to resolve the language as “aa” from the script “Latn”.

I’d much appreciate an understanding of how to derive the maximal likely locale so I can return
its base language.

Many thanks, —Kip





Mark Davis Ⓤ

Apr 23, 2023, 12:24:52 AM
to Kip Cole, CLDR Users Public Mail List
Kip, 

Thanks for the question. The LDML spec should explain how to use Likely Subtags, and what a Unicode language/locale ID looks like when the base language is missing (you use "und", so "und-Latn" would be the starting point). Would you mind checking the LDML spec to see where it is misleading you (or incomplete)? We'll fix it to be clearer.

Thanks,

Mark


Kip Cole

Apr 23, 2023, 12:28:32 AM
to Mark Davis Ⓤ, CLDR Users Public Mail List
Mark, thanks for the response, much appreciated. I eventually got the courage to dive into the icu4j code and realised that using “und” as the base language to resolve the maximal locale using likely subtags is the key. Having seen it, that makes perfect sense. I think the spec could use a little improvement to make that more explicit. If you’re ok with that suggestion I’ll raise a ticket and draft a spec PR.
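
For anyone who lands on this thread with the same question, here is a rough Python sketch of the idea: build the tag "und_<script>", maximize it with likely subtags, and take the first subtag. It is not the icu4j code and not the full LDML algorithm. The table is a tiny hand-written excerpt that I am assuming matches the real likelySubtags.xml (check it against your CLDR version), and the function names parse, maximize and base_language_for_script are made up for illustration.

```python
# Hand-written excerpt for illustration only (assumed to match CLDR data;
# a real implementation would load the full likelySubtags.xml).
LIKELY_SUBTAGS = {
    "aa": "aa_Latn_ET",        # the entry quoted earlier in this thread
    "und": "en_Latn_US",       # assumed entry for the fully unknown locale
    "und_Cyrl": "ru_Cyrl_RU",  # assumed entry for an unknown-language script
}


def parse(tag):
    """Split 'lang[_Script][_REGION]' into (lang, script, region).

    Missing parts come back as ''. Variants and extensions are ignored,
    which is a simplification of the real locale ID syntax.
    """
    parts = tag.replace("-", "_").split("_")
    lang = parts[0].lower() or "und"
    script = region = ""
    for part in parts[1:]:
        if len(part) == 4 and part.isalpha():
            script = part.title()
        elif (len(part) == 2 and part.isalpha()) or (len(part) == 3 and part.isdigit()):
            region = part.upper()
    return lang, script, region


def maximize(tag, table=LIKELY_SUBTAGS):
    """Simplified 'add likely subtags': try progressively shorter keys,
    then fill in whichever of language/script/region the input lacked."""
    lang, script, region = parse(tag)
    candidates = [
        "_".join(p for p in (lang, script, region) if p),
        "_".join(p for p in (lang, script) if p),
        "_".join(p for p in (lang, region) if p),
        lang,
    ]
    for key in candidates:
        if key in table:
            m_lang, m_script, m_region = parse(table[key])
            return "_".join((
                m_lang if lang == "und" else lang,  # "und" means "no language given"
                script or m_script,
                region or m_region,
            ))
    return None  # nothing matched, not even "und"


def base_language_for_script(script):
    """Start from 'und_<script>' rather than scanning the table for the script."""
    full = maximize("und_" + script)
    return full.split("_")[0] if full else None


print(base_language_for_script("Latn"))
print(base_language_for_script("Cyrl"))
```

With this excerpt, "und_Latn" has no entry of its own, so the lookup falls back to "und" and keeps the script that was supplied. The point is that the lookup key starts with "und", so the lexically first row that happens to mention “Latn” (the "aa" row) is never consulted. In icu4j the equivalent call would be ULocale.addLikelySubtags on a "und_Latn" locale, followed by getLanguage().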

Thanks again, —Kip

Mark Davis Ⓤ

Apr 23, 2023, 12:34:15 PM
to Kip Cole, CLDR Users Public Mail List
Sounds good, thanks!