My preference would be "- Approach IANA about an RDF edition of the BCP47 subtag registry".
Btw., since we had a mail exchange about the topic a while ago, there has been a discussion in the W3C i18n working group. At the moment that group is working on guidance about language tags and locale identifiers, in which RDF-related guidance would fit very well, see
How about Wikidata (https://www.wikidata.org/)? For example, https://www.wikidata.org/wiki/Q36236 is for Malayalam and is linked to several identifiers.
Hi, I wrote on this before to the group:
I think it’s important to realise that ISO 639-3 does indeed have its problems, not least the “apparent” descriptor<>tag mismatch, which is confusing. The alternatives and variants have their problems too.
I have adopted ISO 639-3 previously; however, I was forced to adopt a hybrid version including all available language tags from all systems in an application we were building, and we allowed different “PROPER-NOUNS” in “any language” to be added as an altTag for any of those languages.
However, I think it is important to recognise that 639-3 does by far the best job in terms of the range of languages covered. In my opinion, though, we are dealing with “spoken languages” here, even though many languages are accurately represented as written languages too. I sense some confusion between this and the similar language mappings widely used in HTML/XML, by some of the larger multinational localisation vendors, and in the “localisation industry status quo” in general, such as (lang_country) mappings like (ES_MX) or (EN_GB, EN_US).
In addition, in my opinion, another principal issue here is the point of view of the culture defining the standard, i.e. a westernised, English-speaking point of view, which of course rests on assumptions drawn from ISO text mappings and character sets. For example, where does something like “written Traditional Chinese vs Simplified Chinese” come into any of the systems referred to above?
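To make the difference between (lang_country) mappings and a written-language distinction concrete, here is a minimal sketch in Python. It assumes BCP 47 style tags, where a four-letter script subtag such as Hant or Hans can sit between the language and the region. The names LangTag and split_tag are hypothetical, and accepting "_" as a separator is a convenience assumption so that values like ES_MX can be compared. It is not a full BCP 47 parser: variants, extensions, extlang and grandfathered tags are ignored.

```python
import re
from typing import NamedTuple, Optional


class LangTag(NamedTuple):
    """Hypothetical container for the three subtags this sketch recognises."""
    language: str
    script: Optional[str]
    region: Optional[str]


def split_tag(tag: str) -> LangTag:
    """Split a tag such as 'zh-Hant-TW' or 'ES_MX' into subtags by shape.

    Assumption: subtags are recognised purely by length and character class
    (2-3 letters = language, 4 letters = script, 2 letters or 3 digits =
    region). Nothing is checked against a registry.
    """
    parts = re.split(r"[-_]", tag.strip())
    language = parts[0].lower()
    if not re.fullmatch(r"[a-z]{2,3}", language):
        raise ValueError(f"unsupported primary language subtag: {parts[0]!r}")

    script = None
    region = None
    for part in parts[1:]:
        if script is None and region is None and re.fullmatch(r"[A-Za-z]{4}", part):
            script = part.title()   # conventional casing, e.g. 'Hant'
        elif region is None and re.fullmatch(r"[A-Za-z]{2}|[0-9]{3}", part):
            region = part.upper()   # conventional casing, e.g. 'TW'
        else:
            break  # variants, extensions, private use: ignored in this sketch
    return LangTag(language, script, region)


if __name__ == "__main__":
    for example in ("ES_MX", "EN_GB", "zh-Hant-TW", "zh-Hans", "mal"):
        print(example, "->", split_tag(example))
```

With this reading, ES_MX carries only a language and a region, whereas zh-Hant and zh-Hans differ only in the script subtag, which is where a Traditional vs Simplified distinction would be recorded in a tag of this style.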
This really boils down to creating and agreeing on a source index of identifiers for languages, dialects, written languages and scripts; to my knowledge, no such system has yet been thoroughly completed.
That’s my 2 cents. Please feel free to reach out to me if you feel strongly about this, and I apologise if I have offended anybody.
Kind regards
Ronan
From: Felix Sasaki [mailto:fe...@sasakiatcf.com]
Sent: Friday 7 August 2020 12:32
To: Christian Chiarcos <christian...@web.de>
Cc: santhosh....@gmail.com; open-linguistics <open-lin...@googlegroups.com>; Linked Data for Language Technology Community Group <public...@w3.org>; public-...@w3.org
Subject: Re: [open-linguistics] Re: ISO 639 URIs
Dear Christian and all,
FYI and in case you have further comments, I brought this thread to the attention of the W3C i18n working group, see this issue
Also, W3C has started work again on a draft about "language tags and locale identifiers", see the editor's copy here
That version also contains some guidance about working with language tags in the context of RDF, see
Feel free to provide feedback here or within the W3C GitHub, we'd be more than happy to take this into account.
Best,
Felix