OCR for Japanese Text


Hirayama

Mar 12, 2008, 9:29:06 AM
to Honyaku E<>J translation list
Dear colleagues,

is it possible to run OCR software for Japanese texts
(e.g. 読取革命) in an "international" Windows XP environment?

My computer system runs with Windows XP home edition
(localized for Germany).

Why am I asking this? I received 90 pages of a control panel
specification as a PDF file (all information is presented as images),
and I would like to convert it into a format that a TM tool can process
(more concretely: I would like to be able to use at least OmegaT).

Another idea may be to find a provider for this kind of file processing.
Is such a service available on the net?

TIA for your kind assistance

Uwe Hirayama
JP>GER
hira...@freenet.de

Daniel

Mar 12, 2008, 9:54:16 AM
to hon...@googlegroups.com
2008/3/12 Hirayama <hira...@freenet.de>:

> Dear colleagues,
>
> is it possible to run OCR software for Japanese texts
> (e.g. 読取革命) in an "international" Windows XP environment?

Hirayama san,

If you don't have Japanese capabilities on your computer (although I would assume you do as a JA-GER translator), you may have trouble when the OCR software outputs its results to MS Word. In that case, you may want to invest in a Microsoft Japanese language pack for USD 24.95:
http://tinyurl.com/2y9ce4

Best regards,
Daniel Anley

Chandru71

Mar 12, 2008, 11:33:59 AM
to Honyaku E<>J translation list
Hirayama san

Just to check whether you already have it, run Microsoft Office Document
Imaging (All Programs > Microsoft Office > Microsoft Office Tools >
Microsoft Office Document Imaging).
There, look under Tools > Options > OCR > OCR Language.
If you find Japanese in that drop-down box, then the Japanese language
pack is installed on your system.
I have an English version of XP and have installed the Microsoft Office
Proofing Tools, which include language features and OCR capability.
Although your system is localized for German, what is the original XP
version?

Regards
Chandru

Sam Spiteri

Mar 12, 2008, 11:45:05 AM
to hon...@googlegroups.com
Hi Uwe,

Yes, you can run 読取革命 on an XP machine, as long as you have Japanese
installed on the machine. You should boot up with Japanese set as the
default language so you can read and respond during the installation
process. If you are in German mode, you will get mojibake (garbled
characters) during installation, and 読取革命's menus will be garbled, too.

I was not really satisfied with it (though the price was right) or with the
others that I tried, so I finally decided to go with Adobe Acrobat (the
full version, not the Reader). Acrobat will OCR image files and is useful
for many other things.

Also, if the original documents were not scanned well, the OCR output
will be full of mistakes and will take hours to fix.

HTH

-Sam Spiteri

Alan Siegrist

Mar 12, 2008, 11:57:22 AM
to hon...@googlegroups.com
Hirayama writes:

> is it possible to run OCR software for Japanese texts
> (e.g. 読取革命) in an "international" Windows XP environment?
>
> My computer system runs with Windows XP home edition
> (localized for Germany).

The software should run fine. There is no need for any sort of language pack
since you clearly have Japanese support in your mail. However, you should
make sure that the OCR software that you get is Unicode-compatible.

Since Windows XP supports Unicode natively, any Unicode-compatible software
can theoretically run on any country-localized version of XP.

A problem may occur if you are trying to run legacy software that
does not have Unicode support. Such software can be run, but Windows XP
has a single setting for the encoding used by non-Unicode programs, and
it can be set to only one encoding (country) at a time.

Thus, if you have both German non-Unicode software and Japanese non-Unicode
software, only one of the two will display correctly at any given time.

The more recent software is mostly Unicode-based, so newer software should
not give you any trouble.
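
To make the non-Unicode problem concrete, here is a minimal Python sketch (an illustration only, not how any particular OCR package works). It assumes that a legacy Japanese program stores its text as Shift-JIS bytes (Windows code page 932) and that a German system uses code page 1252 for non-Unicode programs; the function name `show_legacy_text` is made up for this example.

```python
def show_legacy_text(text, program_code_page, system_code_page):
    """Simulate a non-Unicode program whose bytes are interpreted by Windows.

    The program writes its text in its own code page; Windows then decodes
    those bytes using the code page chosen for non-Unicode programs.
    """
    raw = text.encode(program_code_page)
    # Bytes that have no meaning in the system code page become U+FFFD.
    return raw.decode(system_code_page, errors="replace")


menu = "読取革命"  # product name from this thread, standing in for a menu label

# System set to Japanese (cp932): the label survives intact.
print(show_legacy_text(menu, "cp932", "cp932"))

# System set to Western European (cp1252, as on a German XP): garbled text.
print(show_legacy_text(menu, "cp932", "cp1252"))
```

A Unicode-based program avoids this because its text is never routed through the single system code page.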

Regards,

Alan Siegrist
Orinda, CA, USA


Marceline Therrien

Mar 12, 2008, 12:50:28 PM
to hon...@googlegroups.com
>-----Original Message-----
>From: hon...@googlegroups.com [mailto:hon...@googlegroups.com] On
>Behalf Of Hirayama
>Sent: March 12, 2008 6:29
>To: Honyaku E<>J translation list
>Subject: OCR for Japanese Text
>
>
>Dear colleagues,
>
>is it possible to run OCR software for Japanese texts
>(e.g. 読取革命) in an "international" Windows XP environment?
>
>My computer system runs with Windows XP home edition
>(localized for Germany).

I use 読取革命 on an English-language XP system.

Change your settings as follows and it will work fine:

Region & Language > Regional Options > Japanese

Region & Language > Advanced > Language for non-Unicode programs > Japanese
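
If you want to confirm what the second setting is currently set to, the following Python sketch may help. It assumes Python is installed on the machine (nothing in this thread requires it) and relies on the Win32 function `GetACP`, which reports the ANSI code page Windows applies to non-Unicode programs; 932 is the Japanese (Shift-JIS) code page, and 1252 is the Western European one used by German and English systems. The helper names are made up for this example.

```python
import sys

JAPANESE_ANSI_CODE_PAGE = 932  # Shift-JIS, the Windows ANSI code page for Japanese


def active_ansi_code_page():
    """Return the ANSI code page used for non-Unicode programs, or None off Windows."""
    if sys.platform != "win32":
        return None
    import ctypes  # imported here because windll only exists on Windows
    return ctypes.windll.kernel32.GetACP()


def is_set_for_japanese(code_page):
    """True if the given code page is the Japanese one."""
    return code_page == JAPANESE_ANSI_CODE_PAGE


if __name__ == "__main__":
    cp = active_ansi_code_page()
    if cp is None:
        print("Not running on Windows; nothing to check.")
    elif is_set_for_japanese(cp):
        print("Non-Unicode programs are set to Japanese (code page 932).")
    else:
        print("Non-Unicode programs use code page %d, not Japanese." % cp)
```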


I'm pretty sure this has been discussed before. Did you check the archives?


Marceline Therrien
J2E Business Translations
Oakland, California, USA


Frode Aleksandersen

Mar 12, 2008, 12:57:14 PM
to Honyaku E<>J translation list
In addition to what others have said, there's a utility you can use
which will allow you to run a non-unicode program in its native
language, without having to change the global setting:

http://www.microsoft.com/globaldev/tools/apploc.mspx

You can also create shortcuts using it, so that you don't have to run
it every time.

/frode

Alan Siegrist

Mar 12, 2008, 1:48:46 PM
to hon...@googlegroups.com
Marceline Therrien writes:

> >is it possible to run OCR software for Japanese texts
> >(e.g. 読取革命) in an "international" Windows XP environment?
> >
> >My computer system runs with Windows XP home edition
> >(localized for Germany).
>
> I use読取革命 on an English-language XP system.
>
> Change your settings as follows and it will work fine:
>
> Region & Language > Regional Options > Japanese
>
> Region & Language > Advanced > Language for non-unicode programs >
> Japanese

Do you know if this setting is necessary for 読取革命?

If so, this could conflict with some legacy German-localized software that
may require this setting to be set to German on Hirayama's system.

Some software does not require this non-Unicode program setting if it is
already Unicode compatible.

Marceline Therrien

Mar 12, 2008, 2:06:05 PM
to hon...@googlegroups.com
Yes, these settings are necessary for 読取革命. That's why I told Uwe to use
these settings.

I wouldn't have taken the time to compose those instructions if the settings
weren't necessary.

And yes, it does screw with some other programs (like Outlook).

I look forward to testing the app suggested by Frode as a possible solution.

Alan Siegrist

Mar 12, 2008, 2:12:38 PM
to hon...@googlegroups.com
Marceline Therrien writes:

> I look forward to testing the app suggested by Frode as a possible
> solution.

Yes, that does look promising.

Chris Loving

Mar 12, 2008, 6:46:03 PM
to Honyaku E<>J translation list
For those that have used both Acrobat and 読取革命, which works better?

I have used the full version of Acrobat and have found that the OCR
works decently, but that when I export to Word the formatting ends up
significantly different from the original document. I haven't used
読取革命.

In translating patents, I've found that recent patents (after about
1995) have clearer text, with characters that are OCR'ed with very
high accuracy, but older patents can get kind of blurry. I wonder if
読取革命 has better recognition technology.

Chris
Alexandria, VA

bendooley

Mar 12, 2008, 7:46:54 PM
to Honyaku E<>J translation list
I use it all the time, and I have to say it makes life much easier.
Highly recommended.

On Mar 12, 11:12 am, "Alan Siegrist" <AlanFSiegr...@Comcast.net>
wrote:

Hirayama

Mar 13, 2008, 4:49:00 AM
to Honyaku E<>J translation list
Thanks to everyone who replied to my question.

I think I will purchase Yomitori Kakumei
(it is not that expensive) and try it first with German
and then with Japanese settings.

I do not expect any difficulties with software localized
for Germany, and even if there are some problems,
I guess one solution may be to set up an additional
user account under Windows XP.

BTW I usually use OpenOffice and OmegaT besides
conventional MS clients for e-mail and internet browsing.

Best wishes,

Uwe Hirayama
hira...@freenet.de
JP, GER
via google groups interface

Sam Spiteri

Mar 13, 2008, 8:06:20 AM
to hon...@googlegroups.com
Hi Chris,

I've used both 読取革命 Lite and Acrobat. Neither of them is
perfect, but I would give 読取革命 slightly better marks for character
recognition.

However, because my main use of OCR is extracting Japanese text from image
PDF files, I decided to just go with the full version of Acrobat, as it only
takes a couple of mouse clicks to get the text once you have the PDF file
open. Recognition and formatting issues are HIGHLY dependent on
the image quality, but the same can be said of 読取革命.

Also, for any other OCR or scanning needs, the software that came with my
All-in-One printer is much easier to use, and I like its interface much
better. But if you don't have any other OCR software, 読取革命 would be a
good program to buy, especially since, as mentioned before, it's not too
expensive.


Regards,

Sam



Marc Adler

Mar 26, 2008, 11:57:00 AM
to hon...@googlegroups.com
On Wed, Mar 12, 2008 at 11:57 AM, Frode Aleksandersen
<frode_ale...@hotmail.com> wrote:

> You can also create shortcuts using it, so that you don't have to run
> it every time.

When I run Yomitori Kakumei with this application, it tells me "my
current system settings can already accommodate" the program. (My
current system settings are U.S. English, and Japanese for
non-unicode.) Then I boot up yomitori, and I get question marks (in
some places, not everywhere).

Any ideas what I should do?

--
Marc Adler
Austin, TX

لا شيء إلا الضوء

Michael

Mar 26, 2008, 2:22:02 PM
to hon...@googlegroups.com
>
> > You can also create shortcuts using it, so that you don't have to run
> > it every time.
>
> When I run Yomitori Kakumei with this application, it tells me "my
> current system settings can already accommodate" the program. (My
> current system settings are U.S. English, and Japanese for
> non-unicode.) Then I boot up yomitori, and I get question marks (in
> some places, not everywhere).
>
> Any ideas what I should do?

This is probably not very helpful, and you are probably already doing this,
but I "just live with it". I also get question marks in certain places but
find that they do not limit the functionality that I need from the software.
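
One plausible source of such question marks (an assumption, not a diagnosis of Yomitori Kakumei itself) is Unicode text being converted to a code page that lacks the characters; the conversion then typically substitutes "?". A minimal Python sketch of that effect, with a made-up function name and sample string:

```python
def through_code_page(text, code_page):
    """Round-trip text through a legacy code page, as a non-Unicode API would.

    Characters the code page cannot represent are replaced with "?".
    """
    return text.encode(code_page, errors="replace").decode(code_page)


label = "OCR 設定"  # hypothetical mixed ASCII/Japanese label

print(through_code_page(label, "cp1252"))  # ASCII survives, kanji become "?"
print(through_code_page(label, "cp932"))   # Japanese code page: unchanged
```

This would also fit the symptom that only some places are affected: anything that is plain ASCII survives the conversion.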

Marceline Therrien

Mar 26, 2008, 2:41:00 PM
to hon...@googlegroups.com
I don't use the program suggested by Frode, but with the settings I
suggested in my earlier post the program works fine with no question marks
anywhere.

Marc Adler

Mar 26, 2008, 3:04:05 PM
to hon...@googlegroups.com
On Wed, Mar 26, 2008 at 1:41 PM, Marceline Therrien
<hon...@thinkjapanese.net> wrote:

> I don't use the program suggested by Frode, but with the settings I
> suggested in my earlier post the program works fine with no question marks
> anywhere.

Right, but setting the entire computer to Japanese screws up
applications that use UIs in other languages. That's why Frode's
program looked so promising.
