Recognising text on whiteboard in photo shoot


Damien Hartwig

Jul 28, 2022, 4:27:55 AM
to tesseract-ocr
Hi

I am trying to detect the photo shoot identifier image in a batch of images.  This photo always has a whiteboard with text written on it.  I am not interested in the handwritten text, just the printed text on the whiteboard.  Detecting the words "Photo" or "Shoot" in the photo will be enough to identify this image.
C5H11295.JPG
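
Once OCR text is available for an image, the check described above (looking for "Photo" or "Shoot") could be sketched as follows. This is a minimal illustration and not something posted in the thread: the function name `contains_keyword`, the 0.75 similarity cutoff, and the idea of using fuzzy matching to tolerate small OCR misreads are all assumptions.

```python
import difflib
import re

# The two printed words mentioned in the post, lower-cased for comparison.
KEYWORDS = ("photo", "shoot")


def contains_keyword(ocr_text, keywords=KEYWORDS, cutoff=0.75):
    """Return True if any token in the OCR text is close to a keyword.

    The cutoff is a hypothetical value; tune it on real OCR output.
    """
    # Keep runs of letters and digits so a misread such as "Sh0ot" stays one token.
    tokens = [t.lower() for t in re.findall(r"[A-Za-z0-9]+", ocr_text)]
    for token in tokens:
        # get_close_matches returns keywords whose similarity ratio >= cutoff.
        if difflib.get_close_matches(token, keywords, n=1, cutoff=cutoff):
            return True
    return False
```

A looser cutoff catches more misreads but also raises the chance of a false match on unrelated words, so it is worth checking against a few known non-identifier images.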

I have tried to identify these words with gImageReader, and it works fine.  But when I run the OCR with tesseract on the command line (version 5.2 on Windows 64-bit), no text is returned.  My understanding is that gImageReader uses the Tesseract engine, so why do I get a result with it but not directly from the command line?
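
One thing that can differ between a front end and a bare command-line call is the page segmentation mode. The command line defaults to `--psm 3` (fully automatic page segmentation), while `--psm 11` (sparse text) is sometimes a better fit for a photo with only a few printed words. Whether this explains the difference seen here is an assumption and is not confirmed anywhere in the thread. A small sketch for trying different modes from Python, with hypothetical helper names `build_tesseract_cmd` and `run_tesseract`:

```python
import subprocess


def build_tesseract_cmd(image_path, psm=11, lang="eng"):
    """Build the argument list for a tesseract call that prints text to stdout.

    psm=11 (sparse text) is only a guess at a useful setting for this kind of photo.
    """
    return ["tesseract", str(image_path), "stdout", "--psm", str(psm), "-l", lang]


def run_tesseract(image_path, psm=11, lang="eng"):
    """Run tesseract (must be on PATH) and return the recognised text."""
    cmd = build_tesseract_cmd(image_path, psm, lang)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout
```

For example, calling `run_tesseract("C5H11295.JPG", psm=p)` for a few values of `p` such as 3, 6, 11, and 12 shows whether any mode returns the printed words.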

Any assistance will be appreciated.

Thanks.

Yunlong Liu

Jul 28, 2022, 5:11:08 AM
to tesseract-ocr
Could it be because gImageReader has some extra preprocessing steps before calling Tesseract to do the actual recognition work?
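
As an illustration of what a preprocessing step can look like, here is a pure-Python sketch of Otsu binarisation on a flat list of grey values. It is not known from this thread what gImageReader actually does, so this is only one common example of such a step. The function names are hypothetical, and loading the JPEG into grey values would need an image library that is not shown here.

```python
def otsu_threshold(pixels):
    """Return the Otsu threshold for an iterable of grey values in 0..255."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg = 0    # number of pixels at or below the candidate threshold
    sum_bg = 0  # sum of grey values of those pixels
    for t in range(256):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue  # no background pixels yet
        w_fg = total - w_bg
        if w_fg == 0:
            break  # every pixel is already in the background class
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        # Between-class variance (up to a constant factor).
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarise(pixels, threshold):
    """Map grey values to pure black (0) or white (255)."""
    return [255 if p > threshold else 0 for p in pixels]
```

Cropping the photo to the whiteboard region before thresholding is another step that may help, since a global threshold is sensitive to uneven lighting across the whole frame.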

Damien Hartwig

Jul 28, 2022, 5:16:58 AM
to tesser...@googlegroups.com
Any idea how I can see what gImageReader is doing and replicate this in the command line with tesseract?

This email and any attachment(s) it may contain is confidential and is intended solely for the use of the individual(s) to whom it is addressed. If you are not the intended recipient of this email, you must not take action based on the contents, nor distribute, nor expose any part of the content(s) to entities or person(s) beyond the original distribution list. Please contact the sender and delete the email if you have received it in error. Thank you.

--
You received this message because you are subscribed to a topic in the Google Groups "tesseract-ocr" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/tesseract-ocr/ZYK154zlJFA/unsubscribe.
To unsubscribe from this group and all its topics, send an email to tesseract-oc...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/acee0e05-dc02-434a-b3da-b40574c6ce6fn%40googlegroups.com.

Yunlong Liu

Jul 28, 2022, 5:24:09 AM
to tesseract-ocr
gImageReader is open source (https://github.com/manisandro/gImageReader), so you can compile it and see what it does with your example.

Damien Hartwig

Jul 28, 2022, 5:31:59 AM
to tesser...@googlegroups.com
Ignoring the fact that it works in gImageReader, and just looking at the photo, how can I improve the results?  

On Thu, 28 Jul 2022 at 11:22 AM, 'Yunlong Liu' via tesseract-ocr <tesser...@googlegroups.com> wrote:
I don't know what gImageReader is, but luckily it seems to be open source: https://github.com/manisandro/gImageReader. Maybe you could try compiling it and see what it does in debug mode.

--
Yunlong Liu (刘运龙)
Vision Engineer Team Lead
LMI Technologies
28F Nison Plaza, 205 West Suzhou Avenue, Industrial Park, Suzhou, China

Mobile: +86 13862426046
Office: +86 512 8718 2787
Fax: +86 512 8718 2770
Email: yl...@lmi3d.com
Web: www.lmi3D.com
  


Zdenko Podobny

Jul 28, 2022, 6:03:57 AM
to tesser...@googlegroups.com
Did you try reading the Tesseract documentation?

Zdenko


On Thu, 28 Jul 2022 at 11:31, Damien Hartwig <damien...@gmail.com> wrote: