To view this discussion on the web visit https://groups.google.com/d/msgid/edict-jmdict/CABHGxq6ekz4gJbDVosaXa3fTo%2B_ghgYZevNw124JK41%2BaJ01rg%40mail.gmail.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/edict-jmdict/7327b78d-951b-41c2-91f3-9c80ee14ed17%40mail.shortwave.com.
Hi everyone,
I also feel that it's better to think about this as the two separate problems outlined by Kim, even though there is potentially a lot of overlap, and definitely a lot of room for fuzzy search. Let me add a few points:
Why adding English verb glosses to n,vs entries is a good idea:
What might be a good way to start adding the glosses:
As for the "computational method": it would be easy and very cheap to do with a commercial LLM. I tried ChatGPT on a dozen examples and it did a pretty good (i.e. educated human-grade) job, even though I gave it only the English glosses. Adding the Japanese words may (or may not) improve the results. GPT-3.5 now costs $0.001/$0.002 per 1K input/output tokens, so we could potentially get provisional glosses for a thousand entries for a few dollars. The glosses would still require human review, but this would save a ton of work. It seems that there are only 13,969 "vs" entries (most of them "n,vs", I guess), correct?
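To make the cost estimate easier to check, here is a minimal sketch of the arithmetic in Python. The two prices are the ones quoted above; the per-entry token counts (2,000 input, 500 output) are purely hypothetical placeholders, not measurements, and the real figures would depend on the prompt and on how long the glosses are.

```python
# GPT-3.5 prices quoted in the message, in USD per 1K tokens.
INPUT_PRICE_PER_1K = 0.001
OUTPUT_PRICE_PER_1K = 0.002


def estimate_cost(entries, input_tokens_per_entry, output_tokens_per_entry):
    """Return the estimated total cost in USD for processing `entries` entries.

    The per-entry token counts are assumptions supplied by the caller.
    """
    input_cost = entries * input_tokens_per_entry / 1000 * INPUT_PRICE_PER_1K
    output_cost = entries * output_tokens_per_entry / 1000 * OUTPUT_PRICE_PER_1K
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical token counts per entry (prompt + glosses in, new glosses out).
    assumed_input_tokens = 2000
    assumed_output_tokens = 500

    for entries in (1000, 13969):
        cost = estimate_cost(entries, assumed_input_tokens, assumed_output_tokens)
        print(f"{entries} entries: about ${cost:.2f}")
```

Under these assumed token counts, a thousand entries come to about $3, which matches the "few dollars" ballpark; plugging in measured token counts from a small trial run would give a more reliable figure.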
--
Adam Nohejl
To view this discussion on the web visit https://groups.google.com/d/msgid/edict-jmdict/FD9ECAD2-FE43-4A10-A8FF-4BA7E1E9C6EE%40gmail.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/edict-jmdict/CE8B6DDC-6F71-45A7-85F0-DBA290A9DBD3%40nohejl.name.