
Controversial region subtags (BCP 47)


Gordon P. Hemsley

Jul 7, 2011, 1:40:02 PM
to Tim Chien

Tim has continuously raised the issue about politically-charged changes
to the names of regions associated with region subtags. While I am
sympathetic to his point, I had not spent much time on it, because I
didn't think we were at that point yet.

But then I realized that the issues he was referring to were actually in
the patch that is up for review right now.

As such, I did a review of the changes to region names that are in that
patch, to see what I need to override.

Some of them are fairly straightforward, simply using the name that is
more common in English:
FM: Federated States of Micronesia => Micronesia
IR: Islamic Republic of Iran => Iran
KP: Democratic People's Republic of Korea => North Korea
KR: Republic of Korea => South Korea
LA: Lao People's Democratic Republic => Laos
LY: Libyan Arab Jamahiriya => Libya
SY: Syrian Arab Republic => Syria
TZ: United Republic of Tanzania => Tanzania
VA: Holy See (Vatican City State) => Vatican City
VN: Viet Nam => Vietnam

Then there are the two Congos, which are differentiated by their capital
cities:
CD: The Democratic Republic of the Congo => Congo-Kinshasa
CG: Congo => Congo-Brazzaville

There is one case where I think a change is simply a correction in spelling:
RE: Reunion => Réunion

In the case of the United Arab Emirates (AE), I am curious as to why the
old codebase uses 'U.A.E.' instead of the full name. Was there political
motivation for that? Because, if not, I am inclined to replace it with
the full expanded name.

In addition, there's the region encoded by the subtag 'CI': Its French
name is Côte d'Ivoire, which translates to Ivory Coast in English. As I
understand it, the region prefers its French name to be used even in
English, but the English translation is common in en-GB. (I'm not sure
what the usage is in en-US.) Which is our preferred name?

In the case of Macedonia (MK), it seems that the preferred usage is
'Macedonia' first, with the 'former Yugoslav Republic' abbreviated
after. Is that correct? (Wikipedia seems to suggest that this issue will
soon be resolved altogether, which would be nice.)

In the case of Taiwan (TW), I take it that the region objects to the
mention of 'Province of China' following 'Taiwan'?

So, those are the cases that I know something about. There are also a
few cases that I would like more information on:

Previously, there was a subtag 'MF', which indicated the region of
"Saint Martin". Now, it appears, there is a new subtag 'SX' for "Sint
Maarten (Dutch part)", which renamed 'MF' to "Saint Martin (French
part)". Now, my question is, essentially: Do we want to keep the
parentheticals or remove them?

And then there is the case of 'SH' and 'TA': 'SH' previously represented
"Saint Helena". Now it has been augmented to represent "Saint Helena,
Ascension and Tristan da Cunha". However, at the same time, 'TA' was
introduced to represent "Tristan da Cunha" on its own. This confuses me;
can anyone enlighten me as to what the difference is? Do we want to change
one, or should we leave them as-is?

And, finally, there were some other new additions:
BQ: Bonaire, Saint Eustatius and Saba
CP: Clipperton Island
CW: Curaçao
DG: Diego Garcia
EA: Ceuta, Melilla
EU: European Union
IC: Canary Islands

I don't anticipate any conflicts with those names, but I wanted to
mention them to make sure I wasn't missing anything.

The uncontroversial controversial changes have already been made to the
file that generates the master list, but I wanted to get feedback on the
rest of them before I commit the changes and update the patch.

Hope this helps everybody.

Gordon

--
Gordon P. Hemsley
http://gphemsley.org/ • http://gphemsley.org/blog/

Nguyen Vu Hung

Jul 7, 2011, 2:49:59 PM
to Gordon P. Hemsley, dev-...@lists.mozilla.org, Tim Chien
On Fri, Jul 8, 2011 at 00:40, Gordon P. Hemsley <gphe...@gmail.com> wrote:

> VN: Viet Nam => Vietnam

+1

The country has only one name whose short form is Vietnam or "Viet Nam".
It does not have the problem China and Korea do.


--
Best Regards,
Nguyen Hung Vu [aka: NVH] ( in Vietnamese: Nguyễn Vũ Hưng )
vuhung16plus{remove}@gmail.dot.com , YIM: vuhung16 , Skype: vuhung16plus,
twitter: vuhung, MSN: vuhung16

Gen Kanai

Jul 8, 2011, 12:01:25 AM
to Gordon P. Hemsley, dev-...@lists.mozilla.org

On 7/8/11 2:40 AM, Gordon P. Hemsley wrote:
> The uncontroversial controversial changes have already been made to the
> file that generates the master list, but I wanted to get feedback on the
> rest of them before I commit the changes and update the patch.

Gordon,

Thank you for your thorough work here.

I'd like to add that Southern Sudan is breaking away from Sudan very
soon, so it may make sense to add that in now vs. later.

https://secure.wikimedia.org/wikipedia/en/wiki/Southern_Sudan

Gen

--
Gen Kanai

Gordon P. Hemsley

Jul 8, 2011, 12:08:57 AM

Gen,

Well, these files are generated from the IANA Language Subtag Registry,
so I'll only be able to do that once the subtag is registered, which
will likely only happen after ISO 3166 assigns Southern Sudan a region
code. (That'll likely take a couple of months.)

Nevertheless, if I had to guess, I would say that we'll be adding an
'SS' region subtag in the near future. The question then will become
whether it's listed as "South Sudan" or "Republic of South Sudan". (I'm
guessing the former.)
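
[Editorial sketch] To illustrate the dependency described above: the region names come from records in the IANA Language Subtag Registry, so a region such as Southern Sudan cannot appear until a record for it exists. The snippet below is a minimal sketch of reading region records, assuming the registry's usual layout of "Field: value" lines with records separated by "%%". The sample text is hand-written for illustration (its dates are placeholders), and parse_regions is a hypothetical helper, not the script used in the BCP47 repo.

```python
# Hand-written sample in the registry's record layout (not a real excerpt).
SAMPLE = """\
Type: region
Subtag: VN
Description: Viet Nam
Added: 2005-10-16
%%
Type: language
Subtag: vi
Description: Vietnamese
Added: 2005-10-16
%%
Type: region
Subtag: CI
Description: Côte d'Ivoire
Added: 2005-10-16
"""


def parse_regions(text):
    """Return {subtag: first description} for every 'region' record."""
    regions = {}
    for record in text.split("%%"):
        fields = {}
        for line in record.strip().splitlines():
            # Folded continuation lines start with whitespace; this
            # simplified sketch ignores them.
            if line[:1].isspace():
                continue
            key, sep, value = line.partition(":")
            if sep:
                # Keep only the first value when a field repeats
                # (a record may carry several Description lines).
                fields.setdefault(key.strip(), value.strip())
        if fields.get("Type") == "region" and "Subtag" in fields:
            regions[fields["Subtag"]] = fields.get("Description", "")
    return regions


if __name__ == "__main__":
    for subtag, name in sorted(parse_regions(SAMPLE).items()):
        print(subtag, name)
```

A subtag like 'SS' would only show up in the output once the registry itself contains a matching region record.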

Gordon

--
Gordon P. Hemsley
http://gphemsley.org/ • http://gphemsley.org/blog/

Tim Chien (MozTW)

Jul 8, 2011, 1:14:04 AM
to Gordon P. Hemsley, Gen Kanai, Mozilla l10n
That's correct. I really appreciate your thorough investigation.
Please let me know when it's time to implement the name rewrite.


Tim

Gordon P. Hemsley

Jul 8, 2011, 1:23:51 AM

Once I've clarified all my questions, I'll commit the change to my repo
on GitHub[1] and update the patch in bug 666662[2]. You're already CC'd
there, so you'll know when it happens. :)

Gordon

[1] https://github.com/GPHemsley/BCP47
[2] https://bugzilla.mozilla.org/show_bug.cgi?id=666662

--
Gordon P. Hemsley
http://gphemsley.org/ • http://gphemsley.org/blog/

Gordon P. Hemsley

Jul 8, 2011, 2:36:17 AM
[I just realized that this discussion might be appropriate for
mozilla.dev.i18n, so cross-posting this reply there, as well.]

> In the case of the United Arab Emirates (AE), I am curious as to why the
> old codebase uses 'U.A.E.' instead of the full name. Was there political
> motivation for that? Because, if not, I am inclined to replace it with
> the full expanded name.

I just did some archaeology[1], and it does not appear that as much
thought was put into these names as I had assumed. The relevant
discussion is in bug 153104.[2]

Given this, I will use my judgement (plus guidance from Wikipedia) on a
per-subtag basis as to whether to use the default name or change it,
unless I receive a specific complaint (e.g. TW and MK).

That means that I'll be doing the following unless instructed otherwise:
* Reverting AE to "United Arab Emirates"
* Reverting CI to "Côte d'Ivoire"
* Reverting FM to "Federated States of Micronesia"
* Keeping MK as "Macedonia, F.Y.R. of"
* Keeping TW as "Taiwan"
* Changing MF and SX to "Saint-Martin" (French) and "Sint Maarten"
(Dutch), respectively
* Leaving SH and TA as-is
* Leaving IR, KP, KR, LA, LY, SY, TZ, VA, and VN as-is
* Leaving CD and CG as-is
* Accepting RE change to "Réunion"
* Accepting additions of BQ, CP, CW, DG, EA, EU, and IC

(These are all relative to how they exist within the current codebase.)
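
[Editorial sketch] One way to picture the per-subtag decisions listed above is as a small override table applied on top of the names taken from the registry. The snippet below is a minimal sketch under that assumption. OVERRIDES and display_names are hypothetical names, not the code in the patch, and the sample registry names are limited to those quoted in this thread.

```python
# Names as quoted in this thread for the registry side (illustrative subset).
REGISTRY_NAMES = {
    "AE": "United Arab Emirates",
    "MF": "Saint Martin (French part)",
    "RE": "Réunion",
    "SX": "Sint Maarten (Dutch part)",
    "TW": "Taiwan, Province of China",
}

# Hypothetical override table: only subtags whose chosen display name
# differs from the registry default need an entry.
OVERRIDES = {
    "MF": "Saint-Martin",
    "MK": "Macedonia, F.Y.R. of",
    "SX": "Sint Maarten",
    "TW": "Taiwan",
}


def display_names(registry_names, overrides=OVERRIDES):
    """Return a new mapping with overrides applied to known subtags."""
    names = dict(registry_names)  # copy, so the input is left untouched
    for subtag, name in overrides.items():
        # Skip overrides for subtags absent from the registry input, so
        # the table cannot introduce regions the registry lacks.
        if subtag in names:
            names[subtag] = name
    return names


if __name__ == "__main__":
    for subtag, name in sorted(display_names(REGISTRY_NAMES).items()):
        print(subtag, name)
```

Keeping the overrides in a separate table means the registry-derived list can be regenerated at any time without losing the hand-made decisions.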

Any objections?

Gordon

[1]
http://bonsai.mozilla.org/cvslog.cgi?file=mozilla%2Ftoolkit%2Flocales%2Fen-US%2Fchrome%2Fglobal%2FregionNames.properties
[2] https://bugzilla.mozilla.org/show_bug.cgi?id=153104

--
Gordon P. Hemsley
http://gphemsley.org/ • http://gphemsley.org/blog/

Gordon P. Hemsley

Jul 8, 2011, 2:24:43 PM

Having heard no objection in the past 12 hours, I've committed the
change to my repo[1].

I'll give it a little more time before I respin the patch for bug 666662[2].

[1]
https://github.com/GPHemsley/BCP47/commit/db6ed5a3f66e8acd27a2eea671f6c9ec2291983f
[2] https://bugzilla.mozilla.org/show_bug.cgi?id=666662

Gordon P. Hemsley

Jul 10, 2011, 2:41:58 AM
On 7/8/11 2:24 PM, Gordon P. Hemsley wrote:
> Having heard no objection in the past 12 hours, I've committed the
> change to my repo[1].
>
> I'll give it a little more time before I respin the patch for bug
> 666662[2].
>
> [1]
> https://github.com/GPHemsley/BCP47/commit/db6ed5a3f66e8acd27a2eea671f6c9ec2291983f
>
> [2] https://bugzilla.mozilla.org/show_bug.cgi?id=666662
>

I have respun the patch:
https://bugzilla.mozilla.org/show_bug.cgi?id=666662
https://bugzilla.mozilla.org/attachment.cgi?id=545051

Jean-Marc Desperrier

Jul 13, 2011, 4:11:49 AM
Gordon P. Hemsley wrote:
> There is one case where I think a change is simply a correction in
> spelling:
> RE: Reunion => Réunion

I think it's a good thing to make that change. English-speaking people
are frequently confused to see the word "reunion" appear in the middle
of a list of countries/locations. The "é" gives a hint that it's something
else, and makes quite a few dig deeper in their memory and realize "Oh,
haven't I heard once about a French island named 'Réunion'?".

Actually in French we always say 'La Réunion'
(http://fr.wikipedia.org/wiki/La_R%C3%A9union), if the English usage
were the same, there would be even less ambiguity.

Jean-Marc Desperrier

Jul 13, 2011, 4:44:16 AM
Gordon P. Hemsley wrote:
> there's the region encoded by the subtag 'CI': Its French name is
> Côte d'Ivoire, which translates to Ivory Coast in English. As I

> understand it, the region prefers its French name to be used even in
> English, but the English translation is common in en-GB. (I'm not
> sure what the usage is in en-US.) Which is our preferred name?

I don't know what the "man in street" usage is, but the official one
seems to lean quite strongly toward "Côte d'Ivoire" :
http://travel.state.gov/travel/cis_pa_tw/tw/tw_5501.html
http://www.state.gov/r/pa/ei/bgn/2846.htm
https://www.cia.gov/library/publications/the-world-factbook/geos/iv.html
http://www.africa.upenn.edu/Country_Specific/Cote.html
(all of those sites don't even state something like "Commonly also named
Ivory Coast" ...)

> Previously, there was a subtag 'MF', which indicated the region of
> "Saint Martin". Now, it appears, there is a new subtag 'SX' for "Sint
> Maarten (Dutch part)", which renamed 'MF' to "Saint Martin (French
> part)". Now, my question is, essentially: Do we want to keep the
> parentheticals or remove them?

I think it depends on what the established English usage is, but it's
a bit ambiguous to remove them. IMO you could shorten to '(French)' and
'(Dutch)'.

Some maps make the distinction by using "Saint Martin" for the French
part and "Sint Maarten" for the Dutch part, but many use both terms
indifferently for the whole Island, and use French/Dutch, or a flag, to
separate each part.

> And then there is the case of 'SH' and 'TA': 'SH' previously represented
> "Saint Helena". Now it has been augmented to represent "Saint Helena,
> Ascension and Tristan da Cunha". However, at the same time, 'TA' was
> introduced to represent "Tristan da Cunha" on its own. This confuses me;
> can anyone enlighten as to what the difference is? Do we want to change
> one, or should we leave them as-is?

Wikipedia states, surprising as it is, that the 'SH' code is actually
ambiguous : It's both the whole Saint Helena, Ascension and Tristan da
Cunha territory, and Saint Helena as an individual entity, with the TA
code then for "Tristan da Cunha" :
http://en.wikipedia.org/wiki/Saint_Helena,_Ascension_and_Tristan_da_Cunha#Administrative_divisions

> The uncontroversial controversial changes have already been made to the
> file that generates the master list, but I wanted to get feedback on the
> rest of them before I commit the changes and update the patch.

Sorry for the late feedback, hope it's still useful.

Jonathan Kew

Jul 13, 2011, 5:01:06 AM
to dev-...@lists.mozilla.org
On 13 Jul 2011, at 09:44, Jean-Marc Desperrier wrote:

> Gordon P. Hemsley wrote:
>> there's the region encoded by the subtag 'CI': Its French name is
>> Côte d'Ivoire, which translates to Ivory Coast in English. As I
>> understand it, the region prefers its French name to be used even in
>> English, but the English translation is common in en-GB. (I'm not
>> sure what the usage is in en-US.) Which is our preferred name?
>
> I don't know what the "man in street" usage is, but the official one seems to lean quite strongly toward "Côte d'Ivoire" :
> http://travel.state.gov/travel/cis_pa_tw/tw/tw_5501.html
> http://www.state.gov/r/pa/ei/bgn/2846.htm
> https://www.cia.gov/library/publications/the-world-factbook/geos/iv.html
> http://www.africa.upenn.edu/Country_Specific/Cote.html
> (all of those sites don't even state something like "Commonly also named Ivory Coast" ...)

"Ivory Coast" is certainly used in en-GB, but so is "Côte d'Ivoire". Anecdotally (FWIW), I'd say there has been a substantial shift towards the French form in the last few decades. As a child, I only recall hearing the English form. Then I started to notice some use of the French, but initially it seemed a rather forced, pretentious usage, like people were consciously trying to be noticed for their cultural sensitivity and correctness. Nowadays, the French form is widely used/recognized, and doesn't stand out as "odd" or self-conscious in the way it did a couple of decades ago; if anything, using "Ivory Coast" might now mark one as being a bit old-fashioned/reactionary, depending on the context.

JK
