OCR for Japanese text

Alexander Zapryagaev (萌覺・Miǒgacu)

Oct 23, 2025, 3:04:44 AM
to PMJS: Listserv
Hello everyone!

I wonder: what is the best tool to OCR Japanese text? I have tried many over the years and have yet to find one that handles text in vertical columns without problems. And even when a tool does a half-decent job, any appearance of furigana completely stumps every product I have tried. What OCR solutions are used in Japan? Could someone recommend a tool (preferably open-source) that can reliably handle Japanese, both horizontal and vertical (and other East Asian languages as a bonus, if possible)?

Best wishes,

Alexander Zapryagaev

Kauê Metzger Otávio

Oct 23, 2025, 2:32:48 PM
to pm...@googlegroups.com
Surely someone here will have a better answer, but the best tool I've found so far is OnlineOCR (in the browser at https://www.onlineocr.net/). I am testing Gemini Pro for extracting text from images and it might be better, but I haven't yet used it as extensively as I have OnlineOCR. That said, OnlineOCR still makes some mistakes, sometimes misreading a kanji or splitting it into multiple kanji and that sort of thing, so you'll always have to proofread the output.

Kind regards,
Kauê Otávio

--
PMJS is a forum dedicated to the study of premodern Japan.
To post to the list, email pm...@googlegroups.com
For the PMJS Terms of Use and more resources, please visit www.pmjs.org.
Contact the moderation team at mod...@pmjs.org
---
You received this message because you are subscribed to the Google Groups "PMJS: Listserv" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pmjs+uns...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/pmjs/e2cf7549-a6f3-4dd4-8ef6-352168f80bfdn%40googlegroups.com.

Paula R. Curtis

Oct 23, 2025, 2:47:43 PM
to pm...@googlegroups.com
Dear all,

Please allow me to direct you to two previous threads on this topic:
https://groups.google.com/u/1/g/pmjs/c/ff6D3R4nmaI/m/nPOBDp-7xqIJ

To search the PMJS archive of messages, visit https://groups.google.com/u/1/g/pmjs and use the "Search Conversations" bar at the top of the page.

Best,

Paula



--
Dr. Paula R. Curtis
Operations Leader, Japan Past & Present
Yanai Initiative for Globalizing Japanese Humanities
Academic Administrator
Department of Asian Languages & Cultures, UCLA


Avery Morrow

Oct 24, 2025, 4:36:25 AM
to pm...@googlegroups.com
Hello everyone,

I noticed the previous threads are from a little while back and don't
mention the NDLkotenOCR software, which is free, open source, and
available for Windows, Mac and Linux. It's also much higher quality
than any other Japanese OCR software I've ever tried.

https://github.com/ndl-lab/ndlkotenocr-lite/blob/master/ndlkotenocr-lite-gui/README.md

Avery Morrow
Brown University