EPWING conversion script down?


freddie matthews

Oct 27, 2022, 10:42:22 PM
to edict-...@googlegroups.com
I noticed this a while back but forgot to report it: on the EDRDG FTP server, the JMdict EPWING conversion that used to be generated weekly (edict_en.fpw.tar.gz) is no longer being updated.

Opening the current EPWING download and going to "About this conversion" has the following: JMdict(eng) 2021-02-26_UTC (DTD v1.07), conversion script v2.7.1. The JMdict meta entry has the creation date as 2021-02-24.

Would anyone know what's going on with the conversion? I personally found the EPWING file to be very useful and use it on both my laptop and phone.

– Opencooper

Jim Breen

Oct 27, 2022, 11:48:52 PM
to edict-...@googlegroups.com
Thanks for pointing this out. I hadn't noticed it had stopped working.
Early in 2021, Monash Uni shut down its public FTP server. I hadn't
realised that Hans Loeffler's EPWING routines used that site to
collect the latest copies of files.

I've dug into his scripts and I think I've identified the line in the
Makefile that names the FTP server. I've swapped it to the edrdg.org
one, so all going well it will run correctly again. The weekly update
is due in a few hours, so we'll know tomorrow whether it worked.

Cheers

Jim



--
Jim Breen
Adjunct Snr Research Fellow, Japanese Studies Centre, Monash University
http://www.jimbreen.org/
http://nihongo.monash.edu/

Jim Breen

Oct 28, 2022, 8:26:52 PM
to edict-...@googlegroups.com
The weekly script ran. I checked its log file and I see that it
reported some problems with JMdict and JMmedict, but not with the
edict, kanjidic and examples files.

Can you check the edict edition and see if it is up-to-date? I'll ask
Hans to check out the JMdict situation.

Jim

freddie matthews

Oct 29, 2022, 1:26:51 AM
to edict-...@googlegroups.com
Thanks for looking into it and updating the script, Jim. Unfortunately, the dates are still the same. The folder actually includes the date in its name once unzipped: JMdict_eng_2021-02-26_UTC. I suspect that we introduced a bunch of non-JIS kanji and other Unicode characters in that time period, which might be causing the conversion to fail. But perhaps Hans would know best what the issue could be.

– Opencooper
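Editorial aside: since the unzipped folder name carries the conversion date, staleness can be checked without opening the dictionary. The following is only a minimal sketch; the function names are hypothetical, and it assumes the folder name contains a single YYYY-MM-DD date as in the example above.

```python
import re
from datetime import date
from typing import Optional


def conversion_date(folder_name: str) -> Optional[date]:
    """Return the first YYYY-MM-DD date found in a folder name, or None."""
    match = re.search(r"(\d{4})-(\d{2})-(\d{2})", folder_name)
    if match is None:
        return None
    year, month, day = map(int, match.groups())
    return date(year, month, day)


def age_in_days(folder_name: str, today: date) -> Optional[int]:
    """Return how many days old the conversion is relative to `today`."""
    found = conversion_date(folder_name)
    return None if found is None else (today - found).days


# Folder name quoted in the message above.
print(age_in_days("JMdict_eng_2021-02-26_UTC", date(2022, 10, 29)))
```

A weekly conversion should never be more than a few days old, so any large value here suggests the file on the server was not regenerated.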

Jim Breen

Oct 29, 2022, 1:48:11 AM
to edict-...@googlegroups.com
I think the JMdict conversion failed so you are probably seeing the old file. Can you check the edict one?

Jim


freddie matthews

Oct 30, 2022, 1:09:06 AM
to edict-...@googlegroups.com
Jim, I'm only seeing one EPWING file for the main dictionary. Under the section "First the EDICT Files", there's edict_en.fpw.tar.gz, which is what unzips to JMdict_eng_2021-02-26_UTC. As for the other EPWING downloads available, kanjidic and jp_examples have 2022-10-28 in their unzipped folder names. The enamdict conversion doesn't have a timestamp once unzipped, but its file modification dates are from 2007. I don't see a listing for an EPWING version of JMdict on the FTP page.

– Opencooper

Jim Breen

Oct 31, 2022, 5:09:16 PM
to edict-...@googlegroups.com
You're correct; there's only an "edict" EPWING edition.

No reply from Hannes. If I can't contact him that might be the end of the line for distributing that format.

Jim


Jim Breen

Dec 15, 2022, 6:01:07 AM
to edict-...@googlegroups.com
A progress report on this. It's taken a while as I had to re-establish
a login method for Hannes to use our server (it's fairly locked down to
deter hackers). It seems that it's failing on some odd readings such
as the タヒ in entry 1310720 (死). We'll probably have to drop such
readings from the EPWING edition as the format can't handle 半角カタカナ.

More eventually.

Jim
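Editorial aside: the kind of filtering described above could look roughly like the sketch below. This is not the actual conversion script, which is not shown in this thread; the function names are hypothetical, and the check assumes that "half-width katakana" means the JIS X 0201 half-width kana range, U+FF61 to U+FF9F. Converting to full-width with NFKC normalization is shown only as a possible alternative to dropping the reading.

```python
import unicodedata
from typing import List

# Assumption: half-width kana and punctuation occupy U+FF61..U+FF9F.
HALFWIDTH_FIRST = 0xFF61
HALFWIDTH_LAST = 0xFF9F


def has_halfwidth_kana(reading: str) -> bool:
    """True if any character falls in the half-width kana range."""
    return any(HALFWIDTH_FIRST <= ord(ch) <= HALFWIDTH_LAST for ch in reading)


def drop_halfwidth_readings(readings: List[str]) -> List[str]:
    """Keep only the readings that contain no half-width kana."""
    return [r for r in readings if not has_halfwidth_kana(r)]


def to_fullwidth(reading: str) -> str:
    """Alternative to dropping: NFKC maps half-width kana to full-width."""
    return unicodedata.normalize("NFKC", reading)


# Hypothetical reading list; "\uff80\uff8b" is a half-width kana pair.
readings = ["し", "\uff80\uff8b"]
print(drop_halfwidth_readings(readings))
print(to_fullwidth("\uff80\uff8b"))
```

Dropping keeps the EPWING edition strictly within what the format accepts, while normalizing would keep the reading but lose the deliberate half-width spelling.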

freddie matthews

Dec 15, 2022, 10:22:28 AM
to edict-...@googlegroups.com
Thanks for the update, Jim. I appreciate the work you and Hannes have put into looking into this so far. That's an odd reading indeed.

– Opencooper
