Today the spec says:
The agency_lang field contains a two-letter ISO 639-1 code for the primary language used by this transit agency.
I've heard people say that this is old-fashioned and should be updated, because ISO 639-1 alone cannot represent many languages and regional or script variants, such as:
zh-Hant (Traditional Chinese)
es-VE (Spanish, Venezuela)
es-AR (Spanish, Argentina)
fil (Filipino)
sr-Latn (Serbian written in Latin script)
This proposal is to change to BCP 47. Justifications:
- Let GTFS use the same language codes as other standards, in particular XML and HTML
- ISO 639-1 has no code for the majority of the world's languages; see above
- BCP 47 tags are a superset of ISO 639-1, so old GTFS files continue to be valid
- See http://www.w3.org/International/articles/bcp47/ for additional justification.
Here is the proposed text:
agency_lang, Optional - The agency_lang field contains an <a href="http://www.rfc-editor.org/rfc/bcp/bcp47.txt">IETF BCP 47 language code</a> for the primary language used by this transit agency, for example <code>en</code> for English or <code>es-AR</code> for Spanish (Argentina). BCP 47 tags are the language identifiers used in HTML and XML documents. Please refer to http://www.w3.org/International/articles/language-tags/ for an introduction.
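For anyone wondering what checking the new field might look like in a feed validator, here is a rough Python sketch. It only tests the common language[-script][-region] shape, which is an assumption made for illustration: the full BCP 47 grammar also allows variants, extensions and private-use subtags, and a real validator would look subtags up in the IANA registry. The function name looks_like_language_tag is hypothetical.

```python
import re

# Simplified pattern: language[-script][-region]. This is an assumption for
# illustration, not the full BCP 47 grammar (no variants, extensions,
# private-use or grandfathered tags, and no registry lookup).
_SIMPLE_TAG = re.compile(
    r"[A-Za-z]{2,3}"                    # primary language subtag, e.g. en, fil
    r"(?:-[A-Za-z]{4})?"                # optional script subtag, e.g. Hant, Latn
    r"(?:-(?:[A-Za-z]{2}|[0-9]{3}))?"   # optional region subtag, e.g. AR, 419
)


def looks_like_language_tag(value):
    """Return True if value has the simplified language[-script][-region] shape."""
    return _SIMPLE_TAG.fullmatch(value) is not None


if __name__ == "__main__":
    for tag in ["en", "es-AR", "zh-Hant", "fil", "sr-Latn", "es_AR", "english"]:
        print(tag, looks_like_language_tag(tag))
```

Two-letter ISO 639-1 values such as en pass this check unchanged, which matches the point that existing feeds would stay valid.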
Are there any data providers who have had difficulty picking an agency_lang value because of the limitations of ISO 639-1? In any case, I think that XML and HTML use a language tag for the same reason that GTFS has agency_lang, and it won't hurt us to take advantage of their experience.