Tesseract GUI. See http://ocrgui.sourceforge.net/

emanuele sicchiero

Sep 12, 2009, 9:37:22 AM
to tesseract-ocr
Hello everybody!

I made a GUI for Tesseract: OcrGui. This is a simple program written in C
using Glib and Gtk+. Download it from http://ocrgui.sourceforge.net/.
I'd like it if this link could be added to Tesseract's home page. It
would be fantastic!

Bye
Emanuele

M. Bashir Al-Noimi

Sep 12, 2009, 9:58:44 AM
to tesseract-ocr
Great, but what about Windows and Ubuntu binaries?

On Sep 12, 4:37 pm, emanuele sicchiero <emanuele.sicchi...@gmail.com>
wrote:
> Hello everybody!
>
> I made a GUI for Tesseract: OcrGui. This is a simple program written in C
> using Glib and Gtk+. Download it from http://ocrgui.sourceforge.net/.

emanuele sicchiero

Sep 14, 2009, 2:52:10 AM
to tesser...@googlegroups.com
You're right, but at the moment OcrGui is only for Linux, because it supports Gocr too. I have no experience using GTK and GLib under Windows or building Windows binaries. Unfortunately I don't have much free time. Anyway, I will try, and I will also try to make Ubuntu binaries.
What do you think about the program? Could it have some other features?

Bye
Emanuele

M. Bashir Al-Noimi

Sep 14, 2009, 7:51:12 AM
to tesseract-ocr
Actually I couldn't test OcrGui because I'm on Ubuntu and I couldn't
find time to compile it. That's why I asked you about Ubuntu
binaries.

Anyway, I looked at the OcrGui screenshots and they look really
good.

P.S.
You can create a cross-platform installer for all Linux distros and for
Windows too by using http://installjammer.com/. I use it for all the
projects I create.

On Sep 14, 9:52 am, emanuele sicchiero <emanuele.sicchi...@gmail.com>
wrote:
> You're right, but at the moment OcrGui is only for Linux, because it supports
> Gocr too. I have no experience using GTK and GLib under Windows or
> building Windows binaries. Unfortunately I don't have much free time. Anyway, I
> will try, and I will also try to make Ubuntu binaries.
> What do you think about the program? Could it have some other features?
>
> Bye
> Emanuele
>
> 2009/9/12 M. Bashir Al-Noimi <mbno...@gmail.com>

Fuad Jamour

Sep 15, 2009, 5:40:25 AM
to tesser...@googlegroups.com
Dear Emanuele,

I successfully compiled your software, and the compilation went smoothly. The program is pretty nice and clean. Great job!

I have some notes:
  • It does not seem possible to change the language code passed to Tesseract. If I'm wrong, please tell me how to change it. In the Preferences window there are some preloaded languages in the Tesseract tab, but the list is not modifiable.
  • When I open one document and then open another one, the program sticks to the first one and does not recognize the second one.
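
For readers following along: the language code normally reaches Tesseract through its `-l` command-line option (`tesseract imagename outputbase -l lang`). Below is a minimal sketch of how a front end might build that call. The helper names `tesseract_command` and `run_ocr` are hypothetical and are not taken from OcrGui's source, which is written in C.

```python
import subprocess


def tesseract_command(image_path, output_base, lang="eng"):
    """Build the argument list for one Tesseract run.

    Tesseract writes the recognized text to output_base + ".txt".
    """
    return ["tesseract", image_path, output_base, "-l", lang]


def run_ocr(image_path, output_base, lang="eng"):
    """Run Tesseract and return the recognized text (needs tesseract on PATH)."""
    subprocess.run(tesseract_command(image_path, output_base, lang), check=True)
    with open(output_base + ".txt", encoding="utf-8") as handle:
        return handle.read()
```

Passing the arguments as a list avoids shell quoting problems when file names contain spaces.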

emanuele sicchiero

Sep 15, 2009, 6:32:21 AM
to tesser...@googlegroups.com
Hi Fuad,
I will try to answer you.

1) Yes, it is possible to change the language code. In the Preferences window you can see all the languages that Tesseract supports at the moment. You can click on the combo box and select the language. There is only one issue: you can select a language even if the corresponding dictionary is not installed on the system. That selection leads to an error during text extraction. The error is displayed, but it's not "user friendly". It would be better to change the Preferences combo box so that the user can only select installed languages. I'm already working on this problem.

I think this answers your question, but maybe you are saying that the languages combo box itself cannot be changed? In that case there could be a problem with the combo box. Please tell me exactly which operations you perform in the Preferences window.
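
One way to offer only installed languages is to scan the Tesseract data directory before filling the combo box. Below is a minimal sketch of that idea, not OcrGui's actual code. The function name `installed_languages` is hypothetical, and the file suffix is an assumption left as a parameter: Tesseract 3.x and later ship one `<lang>.traineddata` file per language, while older releases lay out their data files differently.

```python
import os


def installed_languages(tessdata_dir, suffix=".traineddata"):
    """Return the sorted language codes that have a data file in tessdata_dir.

    The suffix is an assumption; adjust it to match the installed
    Tesseract version's data file naming.
    """
    codes = set()
    for name in os.listdir(tessdata_dir):
        # Keep "eng.traineddata" -> "eng"; skip a bare ".traineddata".
        if name.endswith(suffix) and len(name) > len(suffix):
            codes.add(name[: -len(suffix)])
    return sorted(codes)
```

The Preferences combo box could then be filled from this list instead of a fixed set of languages.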

2) To recognize the second document:
    - click on the second document image in the left vertical panel
      then the text area will become blank and you can see the document image
    - click on the Recognize button to extract the text
      then the text will appear in the text area.
    - at this point you have two possibilities:
           a) save the two texts in two different files -> select only one document and click on the Save button
           b) save the two texts in a single file -> select both documents (with Ctrl-Click) and click on the Save button

      In case b) you can look at the Preferences window first, where you can set the number of blank rows to put between the two texts.
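
Case b) can be illustrated with a short sketch. It only shows the idea of joining several recognized texts with a configurable number of blank rows. The function name `join_texts` and the parameter `blank_rows` are hypothetical and do not come from OcrGui.

```python
def join_texts(texts, blank_rows=1):
    """Join recognized texts, leaving blank_rows empty lines between them."""
    # One newline ends the last line of a text; each further newline
    # produces one blank row before the next text starts.
    separator = "\n" * (blank_rows + 1)
    return separator.join(text.rstrip("\n") for text in texts) + "\n"
```

With `blank_rows=2`, two documents are written as the first text, two empty lines, then the second text.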

Yes...maybe it's complicated...

Thank you very much for trying my program. If you have other questions or suggestions, they are welcome.

Bye
Emanuele

74yrs old

Sep 15, 2009, 12:31:38 PM
to tesser...@googlegroups.com
What is the website address where I can download the GUI?
-sriranga(76yrsold)

emanuele sicchiero

Sep 16, 2009, 2:44:44 AM
to tesser...@googlegroups.com
Go to http://ocrgui.sourceforge.net/
and click on the Download section. You can find the source tarball and an RPM there. There is no deb package yet because I'm still learning how to create one.

Bye
Emanuele


76yrsold

Sep 16, 2009, 5:26:30 AM
to tesseract-ocr
Thanks for the website address. I visited it but could not locate the
software for the Windows (WinXP) platform. It appears that source code
for the Windows platform is not available for download.
With Regards,
-sriranga(76yrsold)

On Sep 16, 11:44 am, emanuele sicchiero <emanuele.sicchi...@gmail.com>
wrote:
> Go http://ocrgui.sourceforge.net/
> and click on Download section. You can find source tarball and rpm. Deb
> package there isn't because I'm learning how to create it.
>
> Bye
> Emanuele

emanuele sicchiero

Sep 16, 2009, 5:35:30 AM
to tesser...@googlegroups.com
Sorry, but the program is not available for Windows at the moment. I know Tesseract has a Windows version, and I will try to make a Windows version of OcrGui too.

Bye
Emanuele

76yrsold

Sep 16, 2009, 5:37:14 AM
to tesseract-ocr
I hope your GUI will support Indic languages like Hindi, Telugu,
Kannada, Tamil, Bangla, etc.
In fact, I have developed tessdata for Tesseract OCR in Kannada.
-sriranga(76yrsold)

emanuele sicchiero

Sep 16, 2009, 5:45:21 AM
to tesser...@googlegroups.com
No, sorry. OcrGui only supports European languages (English, French, German, Italian, Dutch, Portuguese, Spanish), as you can see in the Tesseract tab of the Preferences window. It's a good idea to include Indic languages too. I will think about this, and maybe I will ask you for some example images to test the GUI.

Bye
Emanuele

74yrs old

Sep 16, 2009, 5:53:49 AM
to tesser...@googlegroups.com
Thank you. I am ready to assist you with beta testing and feedback. In this connection, Softi FreeOCR supports all languages; its only drawback is that there is no spell checker. This is for your information.
With Regards,
-sriranga(76yrsold)

74yrs old

Sep 16, 2009, 5:55:46 AM
to tesser...@googlegroups.com
Thank you very much for the good news about a Windows version of the GUI.
I shall wait for it to become available for download.
At present no Kannada dictionary is available. Is it possible to provide a blank (dummy) spell checker, so that the user can add and delete words manually until a Kannada dictionary is available? Alternatively, could the user delete the contents of an existing dictionary, say English, and replace them with words of his chosen language, say Kannada?
With Regards,
-sriranga(76yrsold)

emanuele sicchiero

Sep 16, 2009, 6:23:43 AM
to tesser...@googlegroups.com
I'm not sure I'm able to make a Windows version, because OcrGui also supports Gocr, another OCR program, and not only Tesseract. Unfortunately Gocr is for Linux only (but I'm not so sure; I will verify). Anyway, I will try to create a Windows version, but not soon. I don't have much time. If someone wants to join the project to make a Windows version, they are welcome.

At the moment you must select a dictionary for Tesseract and a dictionary for the spell check, but I know that a dictionary is not mandatory for Tesseract. So I'm thinking of making it possible not to select a dictionary for Tesseract in OcrGui's Preferences window. This is (I think) what you're asking for.


74yrs old

Sep 16, 2009, 6:57:19 AM
to tesser...@googlegroups.com
A Windows binary of Gocr (gocr048.exe) is also available at present.
Regarding the dictionary: even though a dictionary is not mandatory for Tesseract, it is a must, because in my experience the output is generally only 80-90% accurate. I badly need a spell checker, which is why I asked you to add one, as a special case, to your GUI. In a nutshell, please keep the possibility of selecting a dictionary for Tesseract in OcrGui's Preferences window. I see no harm in offering the "blank spell checker" option I suggested, which lets users add and delete words of their own language. It would benefit users of many different languages.
With Best of Luck,
-sriranga(76yrsold)
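
The "blank spell checker" requested above could be as simple as a word list that starts empty and that the user edits. Below is a minimal sketch of that idea. The class name `UserDictionary` and its methods are hypothetical, and nothing here reflects how OcrGui implements spell checking.

```python
import string


class UserDictionary:
    """A spell-check word list that starts empty and is edited by the user."""

    def __init__(self, words=()):
        self._words = {word.lower() for word in words}

    def add(self, word):
        self._words.add(word.lower())

    def remove(self, word):
        # discard() does nothing if the word was never added.
        self._words.discard(word.lower())

    def unknown_words(self, text):
        """Return the words of text that are not in the dictionary."""
        unknown = []
        for token in text.split():
            word = token.strip(string.punctuation)
            if word and word.lower() not in self._words:
                unknown.append(word)
        return unknown
```

Because the list holds plain Unicode strings, it works the same way for Kannada as for English words.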

Tenzin Dendup

Sep 16, 2009, 11:24:52 AM
to tesser...@googlegroups.com
Hi Emanuele,

Thanks for the program OcrGui. I compiled and installed it on my
Debian Squeeze. I just found one problem.
When I go to File -> Preferences, ocrgui crashes with an error message which says:

** ERROR **: Duplicate object id 'image1' on line 1463 (previously on line 1316)
aborting...
Aborted

I am just wondering what the problem could be or if others are also
getting this error.
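
The wording of the message suggests that two objects in a UI definition file share the id 'image1'. Assuming the Preferences window is loaded from a GtkBuilder XML file (an assumption, not confirmed in this thread), a small script can list every repeated id with its line numbers. The function name `find_duplicate_ids` is hypothetical.

```python
import xml.parsers.expat


def find_duplicate_ids(xml_text):
    """Return (id, first_line, duplicate_line) for each repeated <object> id."""
    first_seen = {}
    duplicates = []
    parser = xml.parsers.expat.ParserCreate()

    def on_start(name, attrs):
        if name == "object" and "id" in attrs:
            object_id = attrs["id"]
            line = parser.CurrentLineNumber  # 1-based line of this start tag
            if object_id in first_seen:
                duplicates.append((object_id, first_seen[object_id], line))
            else:
                first_seen[object_id] = line

    parser.StartElementHandler = on_start
    parser.Parse(xml_text, True)
    return duplicates
```

Running it on the contents of the UI file should point at the same two lines that the error reports, which makes it easier to rename one of the objects.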

Regards
Tenzin

emanuele sicchiero

Sep 18, 2009, 4:40:35 AM
to tesser...@googlegroups.com
Hi Tenzin,
I just took a look at the lines shown in your error, but I didn't understand the problem. I would need a system similar to yours to reproduce it. I tried OcrGui on Suse 11.1 and Ubuntu Jaunty and never saw that error, and neither did the others who installed it.
I hope to find time to create a deb package, so that you can try installing that instead.

Bye
Emanuele
