Please advise

myquest

Oct 16, 2019, 6:44:30 AM
to tesseract-ocr
Hi


I am a newbie and need help getting started with Tesseract. Please advise me on the first few steps, given that I have PDF/image datasheets that I want to extract data from.

Do I have to write my own application? If so, what would be the best way to do it?
If I use the existing code, how should I proceed?

Thanks a ton!

René Hansen

Oct 16, 2019, 6:54:46 AM
to tesser...@googlegroups.com
Start by reading the available documentation:

https://github.com/tesseract-ocr/tesseract/wiki

If you feel some of it could be clearer or improved, feel free to request changes and/or provide the updates yourself.


/René


--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-oc...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/980c837f-8c63-4198-9f42-421dd2d04576%40googlegroups.com.

Leopold Hamminger

Oct 16, 2019, 7:43:56 AM
to tesseract-ocr
I was new a few weeks ago and found Tesseract quite easy to use. However, you should know the basics of console input (which I remember from my DOS days). I am using Windows and downloaded the v5 alpha version. As input I use .jpg files; the output is a text file. Initially the German umlauts did not work; with some help I got here, I reinstalled, taking care to include the German language option. I am reasonably happy with the result; it serves my purpose.
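
A minimal sketch of the console workflow described above, wrapped in Python so it can be repeated over many files. It assumes the `tesseract` executable is on the PATH and that the German language data (`deu`) was installed; the function names and the file name `scan.jpg` are hypothetical, chosen only for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def build_tesseract_command(image_path, output_base, lang="deu"):
    """Return the argument list for: tesseract <image> <output_base> -l <lang>."""
    return ["tesseract", str(image_path), str(output_base), "-l", lang]


def ocr_to_text_file(image_path, lang="deu"):
    """Run Tesseract on one image and return the path of the text file it writes."""
    image_path = Path(image_path)
    # Tesseract appends ".txt" to the output base name by itself.
    output_base = image_path.with_suffix("")
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    # check=True raises CalledProcessError if Tesseract exits with an error.
    subprocess.run(build_tesseract_command(image_path, output_base, lang), check=True)
    return Path(str(output_base) + ".txt")
```

Calling `ocr_to_text_file("scan.jpg")` would be equivalent to typing `tesseract scan.jpg scan -l deu` in the console, and should leave the recognized text in `scan.txt` next to the image.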

Shree Devi Kumar

Oct 16, 2019, 8:25:01 AM
to tesseract-ocr
There are also third-party GUIs for Tesseract. The ones that I have used at times are VietOCR and gImageReader.


INAM HAQ

Oct 16, 2019, 12:47:04 PM
to tesser...@googlegroups.com
Thank you all for the kind advice. I have attached an image; please advise me how to proceed to get the data from it, as it is kind of a mess.

test1001.jpg

Zdenko Podobny

Oct 17, 2019, 6:42:24 AM
to tesser...@googlegroups.com
Show us what you have already tried.

Zdenko


On Wed, Oct 16, 2019 at 18:46, INAM HAQ <inamc...@gmail.com> wrote:

INAM HAQ

Oct 17, 2019, 8:25:22 AM
to tesser...@googlegroups.com
Please guide me on how and where to start. My idea is to write a script; is that the right approach?

Zdenko Podobny

Oct 17, 2019, 9:12:20 AM
to tesser...@googlegroups.com
People on the forum have already pointed you to docs and solutions. So it is up to you how you start.
We can help you, but we will not do your job.

Zdenko


On Thu, Oct 17, 2019 at 14:25, INAM HAQ <inamc...@gmail.com> wrote:

INAM HAQ

Oct 17, 2019, 12:13:43 PM
to tesser...@googlegroups.com
You are absolutely right, I have to do my job, but I am a bit confused: should I modify the existing code or write my own script in Python?
I have read all the documents but am still struggling a little. My request is just for a starting point; then I will post what I achieve.

Thanks a ton!

Shree Devi Kumar

Oct 17, 2019, 12:32:48 PM
to tesseract-ocr
Have you seen the wiki page


There are also multiple existing projects using tesseract with python.
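
As one possible starting point for the Python route, here is a minimal sketch. It assumes the third-party packages `pytesseract` and `Pillow` are installed alongside Tesseract itself; the thread does not name a specific project, so this choice is an assumption, and the helper names are hypothetical.

```python
def clean_lines(text):
    """Drop empty lines and surrounding whitespace from raw OCR output."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def extract_lines(image_path, lang="eng"):
    """OCR one image and return its non-empty text lines."""
    # Imported lazily so clean_lines() stays usable without these packages.
    from PIL import Image
    import pytesseract

    with Image.open(image_path) as img:
        # image_to_string calls the installed tesseract executable.
        text = pytesseract.image_to_string(img, lang=lang)
    return clean_lines(text)
```

Something like `extract_lines("test1001.jpg")` would return the recognized lines as a list, which a script can then search for the datasheet fields it needs. For a messy scan, the raw result will probably need further cleanup or image preprocessing first.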

