On Wed, Jun 9, 2021 at 6:25 AM Alexandru Pojoga <apo...@gmail.com> wrote:
>
> Were they scraping the contents of EDICT/Kanjidic?? Someone should tell them those are freely downloadable...
I can see a reason why someone would do that, and that's to scrape
information that is not in EDICT/Kanjidic. For instance, deleted
entries:
http://www.edrdg.org/jmdictdb/JMdict_deletedentries
I am myself in a situation where I am considering asking permission to
do something similar. My software uses the JMdict entry ID as a key
for users to mark entries of interest (i.e. for studying). As JMdict
evolves, some of these entries are getting removed. I just don't want
to silently delete user data whenever a deleted entry is referenced.
Instead, I'd like to display what the entry looked like so the
user can find a replacement or just decide to drop it. To that end I
need the deleted entries' data, and so far the only way I have found to
do this is to query all these entries one by one from the JMdict
database. If there is a better way to do it, I'd love to hear about
it. If not, Jim, would you find it acceptable if I scraped all the
deleted entries once (and only once) over the course of several
days/weeks?
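To make the idea concrete, here is a minimal sketch of the fallback lookup described above, assuming the deleted entries have already been collected into a local archive. Everything in it is hypothetical: the names `Resolution` and `resolve_bookmarks`, the dictionary-based storage, and the placeholder IDs and texts are for illustration only and are not part of JMdict or of any existing tool.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass
class Resolution:
    entry_id: int
    status: str            # "active", "deleted", or "unknown"
    text: Optional[str]    # what to show the user, if anything


def resolve_bookmarks(
    bookmarked_ids: Iterable[int],
    current: Dict[int, str],
    deleted_archive: Dict[int, str],
) -> List[Resolution]:
    """Classify every bookmarked entry ID without dropping any of them."""
    results = []
    for entry_id in bookmarked_ids:
        if entry_id in current:
            # Entry still exists in the current dictionary release.
            results.append(Resolution(entry_id, "active", current[entry_id]))
        elif entry_id in deleted_archive:
            # Entry was removed: show the archived copy so the user can
            # pick a replacement or drop the bookmark.
            results.append(
                Resolution(entry_id, "deleted", deleted_archive[entry_id])
            )
        else:
            # Neither current nor archived: keep the ID and flag it.
            results.append(Resolution(entry_id, "unknown", None))
    return results


# Placeholder data only; these are not real JMdict entry IDs.
current_entries = {1: "entry still in the dictionary"}
archived_entries = {2: "entry as it looked before deletion"}
resolved = resolve_bookmarks([1, 2, 3], current_entries, archived_entries)
```

The point of the three-way split is that a bookmark is never removed automatically: a deleted entry is still displayed from the archive, and an ID found nowhere is kept and flagged rather than discarded.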
(and to answer a predictable question, I have nothing to do with the
scraping attempts mentioned above, nor have I attempted any of my
own so far :P)
Cheers,
Alex.