Inconsistency in the lsource xml:lang attribute

Nicolò Di Domenico

Jul 2, 2021, 11:44:21 AM
to EDICT-JMdict
Hi everyone,

I'd like to point out a small inconsistency in the JMdict XML, particularly in the xml:lang attribute of the lsource tag, used to mark any entry's source terms.
When parsing it, I noticed that, while almost every entry uses the ISO 639-2B standard for the three-letter language code (such as "dut", "ice", etc.), some entries use "rum" for Romanian, which instead comes from the ISO 639-2T standard. The correct language code for Romanian would be "rom".
The only two entries I found with this kind of issue are 1144530 and 2833660.
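
For anyone who wants to run a similar check, here is a minimal sketch of how the xml:lang values on lsource elements could be tallied with Python's standard library. The sample document, its entry numbers (9000001 and so on) and its source words are invented for illustration and are not real JMdict data; the sketch also assumes that an lsource element without an explicit xml:lang should be counted as "eng".

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# ElementTree exposes the predeclared xml: prefix under this namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Invented sample in the general shape of JMdict entries (not real data).
SAMPLE = """<JMdict>
  <entry><ent_seq>9000001</ent_seq>
    <sense><lsource xml:lang="dut">woord</lsource></sense></entry>
  <entry><ent_seq>9000002</ent_seq>
    <sense><lsource xml:lang="rum">cuvant</lsource></sense></entry>
  <entry><ent_seq>9000003</ent_seq>
    <sense><lsource>word</lsource></sense></entry>
</JMdict>"""


def lsource_langs(root):
    """Map each lsource language code to the ent_seq values that use it."""
    found = defaultdict(list)
    for entry in root.iter("entry"):
        seq = entry.findtext("ent_seq")
        for lsource in entry.iter("lsource"):
            # Assumption: a missing xml:lang is treated as "eng".
            found[lsource.get(XML_LANG, "eng")].append(seq)
    return dict(found)


langs = lsource_langs(ET.fromstring(SAMPLE))
for code in sorted(langs):
    print(code, langs[code])
```

Sorting the resulting codes makes an unexpected one easy to spot by eye.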

Regards,
Nicolò

Jim Breen

Jul 3, 2021, 7:46:56 AM
to edict-...@googlegroups.com
Thanks for the alert. It's probably a typo that crept into a database table. Should be able to get it fixed in a few days.

Jim



Stuart McGraw

Jul 3, 2021, 12:30:24 PM
to edict-...@googlegroups.com
Hi Nicolò,

Thanks for the feedback. The JMdictDB help at:

http://edrdg.org/jmdictdb/cgi-bin/edhelp.py#kw_lang

gives a source for the ISO-639-2 language code keywords:

https://www.loc.gov/standards/iso639-2/php/code_list.php

which lists the 2B code for Romanian (also Moldavian/Moldovan) as "rum". The "rom" code refers to the Romany language. Both are available in the JMdictDB languages table and either can be used in an entry depending on which language you wish to refer to.
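
To make the distinction concrete, here is a small sketch of a lookup that reports whether a three-letter code is a bibliographic (B) or terminology (T) code. The table is a hand-copied subset of four languages from the ISO 639-2 list, not the JMdictDB languages table, and the names ISO_639_2_SUBSET and describe are made up for this example.

```python
# (language, bibliographic code, terminology code); a hand-copied subset only.
ISO_639_2_SUBSET = [
    ("Dutch", "dut", "nld"),
    ("Icelandic", "ice", "isl"),
    ("Romanian", "rum", "ron"),
    ("Romany", "rom", "rom"),
]


def describe(code):
    """Return (language, kind) pairs for a code; kind is "B", "T" or "B/T"."""
    matches = []
    for language, bib, term in ISO_639_2_SUBSET:
        if code == bib == term:
            # Most languages have a single code shared by both sets.
            matches.append((language, "B/T"))
        elif code == bib:
            matches.append((language, "B"))
        elif code == term:
            matches.append((language, "T"))
    return matches


for code in ("rum", "ron", "rom"):
    print(code, describe(code))
```

With this subset, "rum" resolves to Romanian as a B code, "ron" to Romanian as a T code, and "rom" to Romany.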

Note that, for brevity, the languages listed in the JMdictDB help file are only those actually used in some entry, not the full list of available ones. The text above the list mentions this, although perhaps not very obviously.

Thanks,

--- Stuart