dictionary file update dates


Martin Pfundmair

Jan 24, 2021, 12:02:48 AM
to edict-...@googlegroups.com
Dear list members,

I'm sure this has been asked 

Is there some sort of automatically updated file with the latest creation dates for all the dictionary files? I'm particularly interested in the XML versions, JMdict and KANJIDIC2. I've seen corresponding files for other dictionaries, such as

enamdict
http://ftp.edrdg.org/pub/Nihongo/enamdicthdr.txt

and kradfile
http://ftp.edrdg.org/pub/Nihongo/kraddate


I'm writing a mobile application that uses these files, and an easy way to check whether the local database is outdated, without downloading the whole file, would be helpful.

Of course, this also depends on whether those dates change only when the contents of a file actually change, or whether they are just the timestamp of the latest recompilation even when nothing has changed.


Best regards,

Martin

Ben Bullock

Jan 24, 2021, 2:57:47 AM
to edict-...@googlegroups.com
On Sun, 24 Jan 2021 at 14:02, Martin Pfundmair <martin.p...@gmail.com> wrote:
> Dear list members,
>
> I'm sure this has been asked
>
> Is there some sort of automatically updated file with all the latest creation dates for the dictionary files.

You're meant to use rsync to do this.


 

Martin Pfundmair

Jan 24, 2021, 3:47:48 AM
to edict-...@googlegroups.com
Thank you for the hint, I saw that part of the documentation.
However, rsync is not really an option for me at the moment on the platform I'm using.
So I was wondering if there's maybe a different way to check for updates consistently.


Jim Breen

Jan 24, 2021, 3:40:43 PM
to edict-...@googlegroups.com
Both the JMdict and Kanjidic XML files are generated daily, so up to a point the latest creation date is irrelevant. The actual content of JMdict almost always changes each day, but Kanjidic usually has only a couple of changes per month. I need to look into it, but maybe I can log the date of the most recent content change.

Jim


--
You received this message because you are subscribed to the Google Groups "EDICT-JMdict" group.
To unsubscribe from this group and stop receiving emails from it, send an email to edict-jmdict...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/edict-jmdict/CAAWHJxy8FMQVn5t-tPPERpAK1wrMitYyurwRjTDptUXd3er7kA%40mail.gmail.com.

Jim Breen

Jan 26, 2021, 12:30:46 AM
to edict-...@googlegroups.com
A little bit more on this matter.

On Sun, 24 Jan 2021 at 16:02, Martin Pfundmair
<martin.p...@gmail.com> wrote:
> Is there some sort of automatically updated file with all the latest creation dates for the dictionary files. In particular I'm interested in the xml versions JMdict and KANJIDIC2. I've seen corresponding things for other dictionary files like
>
> enamdict
> http://ftp.edrdg.org/pub/Nihongo/enamdicthdr.txt

Yes, but this is always today's date, as the enamdict file is
generated daily, even if the database has had no updates.

> and kradfile
> http://ftp.edrdg.org/pub/Nihongo/kraddate

Ditto. This is just today's date.

That information on the ftp site is really legacy stuff from the days
when the creation/update was much more sporadic.

Both the JMdict and Kanjidic2 XML files contain the date of
generation, which takes place daily. For example kanjidic2.xml has a
header element containing:

<database_version>2021-026</database_version>
<date_of_creation>2021-01-26</date_of_creation>

And JMdict has a pseudo-entry containing:

<gloss>Japanese-Multilingual Dictionary Project - Creation Date:
2021-01-26</gloss>
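
A minimal sketch of reading those two dates in Python, assuming the relevant part of each file is already available as text and that the formats match the two examples above. The function names are hypothetical and not part of any EDRDG tooling.

```python
import re
from datetime import date
from typing import Optional

# Patterns modelled on the two examples quoted above; the formats are
# assumed to stay as shown there.
KANJIDIC2_DATE = re.compile(
    r"<date_of_creation>\s*(\d{4})-(\d{2})-(\d{2})\s*</date_of_creation>"
)
# \s* also covers the case where the date wraps onto the next line.
JMDICT_DATE = re.compile(r"Creation Date:\s*(\d{4})-(\d{2})-(\d{2})")


def _first_date(pattern: "re.Pattern[str]", text: str) -> Optional[date]:
    """Return the first date matched by pattern, or None if there is none."""
    match = pattern.search(text)
    if match is None:
        return None
    year, month, day = (int(group) for group in match.groups())
    return date(year, month, day)


def kanjidic2_creation_date(text: str) -> Optional[date]:
    """Hypothetical helper: date from the kanjidic2.xml header element."""
    return _first_date(KANJIDIC2_DATE, text)


def jmdict_creation_date(text: str) -> Optional[date]:
    """Hypothetical helper: date from the JMdict pseudo-entry gloss."""
    return _first_date(JMDICT_DATE, text)
```

Since both files are regenerated daily, these dates say when a file was built, not when its content last changed.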

I can see why you'd like to know if there's been a change in a file's
actual content. I'll see if I can put something together but it's not
exactly a high priority at the moment.
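
On the client side, one way to tell whether two downloaded copies differ in actual content is to hash the text with the date stamps masked out. The sketch below assumes the only parts that change on every rebuild are the three fields shown in this message, which has not been verified against the real files; `content_fingerprint` is a hypothetical name. It still requires downloading the file, so it only avoids rebuilding a local database needlessly.

```python
import hashlib
import re

# Fields assumed to change on every daily rebuild (taken from the examples
# in this message; real files may contain other volatile text).
VOLATILE = re.compile(
    r"<database_version>.*?</database_version>"
    r"|<date_of_creation>.*?</date_of_creation>"
    r"|Creation Date:\s*\d{4}-\d{2}-\d{2}",
    re.DOTALL,
)


def content_fingerprint(xml_text: str) -> str:
    """Hash the text after removing the assumed-volatile date fields."""
    stable = VOLATILE.sub("", xml_text)
    return hashlib.sha256(stable.encode("utf-8")).hexdigest()
```

If the fingerprint of a fresh download equals the stored one, the local database can be left as it is.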

Cheers

Jim

--
Jim Breen
Adjunct Snr Research Fellow, Japanese Studies Centre, Monash University
http://www.jimbreen.org/
http://nihongo.monash.edu/

Martin Pfundmair

Jan 30, 2021, 2:35:08 AM
to edict-...@googlegroups.com
Thank you for the more detailed answer, Jim.
Don’t worry about it, this should definitely be a low priority. 


As a side note, I've stumbled across your gitlab repository and saw that you're using PostgreSQL.
I work with SQL on a daily basis and could have a look at a possible SQL query to fetch the latest modification dates from the database itself, if that's something you'd consider worth pursuing.


Have a good weekend!

Martin
 


Jim Breen

Feb 2, 2021, 12:23:31 AM
to edict-...@googlegroups.com
On Sat, 30 Jan 2021 at 18:35, Martin Pfundmair
<martin.p...@gmail.com> wrote:
> As a side note, I've stumbled across your gitlab repository and saw that you're using PostgreSQL.
> I'm working with SQL on a daily basis and could have a look at a possible sql query to fetch the latest modification dates from the database itself. If that's something you'd consider worth pursuing.

To be frank I don't think it's worth pursuing. Barely a day passes
without at least one entry in the JMdict database being changed. The
JMnedict (names) database changes less often but there are still
usually several changes a week.

I'd be looking to see if you can use rsync, as Ben suggested. Failing
that, aim to refresh the main files on a schedule, e.g. weekly.
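
The scheduled refresh can be sketched in a few lines of Python. This assumes the application stores the date of its last successful download; `needs_refresh` and the 7-day default are illustrative choices, not something the project prescribes.

```python
from datetime import date, timedelta


def needs_refresh(last_refresh: date, today: date, interval_days: int = 7) -> bool:
    """Return True once at least interval_days have passed since last_refresh."""
    return today - last_refresh >= timedelta(days=interval_days)
```

For example, `needs_refresh(date(2021, 1, 26), date(2021, 2, 2))` is true, because exactly seven days have passed.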

Martin Pfundmair

Feb 2, 2021, 4:17:03 AM
to edict-...@googlegroups.com
Thanks for the input. With the changes being so frequent, your assessment makes a lot of sense. 

Thanks again.

Martin

