Reading Checks project


freddie matthews

Apr 21, 2021, 1:16:05 AM
to edict-...@googlegroups.com
Hey everyone,

I did a small project a while back to crosscheck the readings of JMDict entries with those in Daijirin. Here's the sorted list of 237 entries which exist in both and where the readings differ. Some notes:

* Entries that had multiple readings in either were not considered
* This was done using an older version of Daijirin
* The edict file is also stale (Dec 2020) so entries might have been updated since then (e.g. 半濁音符)
* A few entries are false positives where Daijirin wrote the reading in hiragana and we have it in katakana
* Some other false positives or cases where both are valid
* Something similar could probably be done with MeCab/Unidic
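
A minimal sketch of the comparison described in the notes above, not the script that was actually used. It assumes each dictionary has already been loaded into a mapping from headword to a list of readings; that layout and the names `kata_to_hira` and `differing_readings` are hypothetical. Folding katakana to hiragana before comparing would avoid the kana-script false positives mentioned above.

```python
def kata_to_hira(text: str) -> str:
    """Fold katakana (U+30A1..U+30F6) to hiragana; other characters pass through."""
    return "".join(
        chr(ord(ch) - 0x60) if "\u30a1" <= ch <= "\u30f6" else ch
        for ch in text
    )


def differing_readings(jmdict: dict, daijirin: dict) -> list:
    """Return sorted (headword, jmdict_reading, daijirin_reading) mismatches.

    Both arguments map a headword to a list of readings (assumed layout).
    """
    mismatches = []
    # Only headwords present in both dictionaries are compared.
    for headword in sorted(jmdict.keys() & daijirin.keys()):
        ours, theirs = jmdict[headword], daijirin[headword]
        # Entries with multiple readings on either side are not considered.
        if len(ours) != 1 or len(theirs) != 1:
            continue
        # Compare after kana folding so ネコ and ねこ count as the same reading.
        if kata_to_hira(ours[0]) != kata_to_hira(theirs[0]):
            mismatches.append((headword, ours[0], theirs[0]))
    return mismatches


# Toy data for illustration only, not taken from either dictionary.
toy_jmdict = {"猫": ["ネコ"], "甲": ["こう"], "乙": ["おつ", "きのと"]}
toy_daijirin = {"猫": ["ねこ"], "甲": ["かぶと"], "乙": ["おつ"], "丙": ["へい"]}
result = differing_readings(toy_jmdict, toy_daijirin)
```

With the toy data, only 甲 is reported: 猫 matches after kana folding, 乙 is skipped for having multiple readings, and 丙 exists in only one mapping.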

I'll probably work through this list when I'm inclined to, but I wanted to bring it up here for context. In case I never do, it's up for grabs, of course.

– Opencooper

Jim Breen

Apr 21, 2021, 1:50:09 AM
to edict-...@googlegroups.com
Looks interesting. I'm a bit tied up with medical matters at present but I'm looking forward to going through the list.

数年 was a surprise, but I see Daijirin actually has both. All JEs have すうねん.

Jim



Jim Breen

Apr 22, 2021, 9:51:09 PM
to edict-...@googlegroups.com
I've been poking at a few on the list and proposed some amendments. If it's OK I think I'll tip them into a www page which I can annotate.

Quite a few are valid alternatives or, like the 辛く mismatch, derive from two different adjectives.

It will be next week before I can do much with it.

Jim

Jim Breen

Apr 23, 2021, 4:26:02 AM
to edict-...@googlegroups.com
OK, the first hundred or so are at http://www.edrdg.org/~jwb/jmdaijrcomp.html

Jim

freddie matthews

Apr 23, 2021, 10:05:27 AM
to edict-...@googlegroups.com
Thanks for taking the time to go through these, Jim. I trust you to do a much better job with them than me. The table you made is nice, good work so far. Sorry in advance for all the false positives. If you get tired of them, let me know and I'll continue.

– Opencooper

Jim Breen

Apr 28, 2021, 7:53:25 PM
to edict-...@googlegroups.com
I've gone through the first 25 or so and the result has been a few
changes to entries.
Feel free to pitch in; the more the merrier.

Jim
--
Jim Breen
Adjunct Snr Research Fellow, Japanese Studies Centre, Monash University
http://www.jimbreen.org/
http://nihongo.monash.edu/