Hi Jon,
As I do every morning, I was checking my email when I saw those headstone images from the graves. I am a God-fearing person, so I could not ignore your email.
Regarding the preprocessing step, I suggest applying the local-minima method for background removal. However, you may need to adjust the window size to achieve the best results. I did some experiments with my MATLAB code and got good results; testing on a larger sample set may improve this step.
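Vicky's MATLAB code isn't shown in the thread, so here is a rough stdlib Python sketch of the local-minima idea: estimate the background as the minimum over a small tile around each pixel and subtract it. The interpolation step she mentions is skipped here, and the tile size `win` is the parameter you would tune.

```python
# Rough sketch (not Vicky's actual MATLAB code) of local-minima
# background removal on a grayscale image stored as a list of lists.
def remove_background(img, win=4):
    """Estimate the background as the local minimum in win x win tiles,
    then subtract it from each pixel (clamping at 0)."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            y0, x0 = (y // win) * win, (x // win) * win
            tile = [img[yy][xx]
                    for yy in range(y0, min(y0 + win, h))
                    for xx in range(x0, min(x0 + win, w))]
            out[y][x] = max(0, img[y][x] - min(tile))
    return out
```

A proper implementation would interpolate between tile minima instead of using each tile's minimum directly, which is the complication Vicky alludes to later in the thread.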
Please tell me what project you are working on; maybe I can contribute more effectively. Just let me know if you need any kind of help!
Best Regards,
Vicky
--
You received this message because you are subscribed to the Google Groups
"tesseract-ocr" group.
To post to this group, send email to tesser...@googlegroups.com.
To unsubscribe from this group, send email to
tesseract-oc...@googlegroups.com.
For more options, visit this group at
http://groups.google.com/group/tesseract-ocr?hl=en.
I don't know if it's intentional, but all your links to images report
"We're sorry. The page you tried to access is not available". As things
stand, no advice can be offered on your issue...
Warm regards,
Dmitry Silaev
Dear Jon,
Try analyzing with some preprocessing steps, as below:
Step 1: Detect the ROI
https://picasaweb.google.com/congnguyenba/TesseractBasedOCR#5576366756516993234
Step 2: Apply a low-pass FFT filter with these parameters:
- intensity threshold: 130
- FFT cutoff: 15%
https://picasaweb.google.com/congnguyenba/TesseractBasedOCR#5576366759922523650
Step 3: Scale the image by a scale factor
https://picasaweb.google.com/congnguyenba/TesseractBasedOCR#5576366756371708834
Step 4: Try to recognize using Tesseract or another engine
https://picasaweb.google.com/congnguyenba/TesseractBasedOCR#5576366764338605922
Step 5: Post-processing, if required
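Parts of steps 2 and 3 above can be sketched in plain Python for illustration. The low-pass FFT filter itself is omitted; only the intensity threshold of 130 and an integer-factor nearest-neighbour rescale are shown, and both parameter choices follow Cong's numbers rather than anything tuned.

```python
# Illustrative sketch of the thresholding and scaling steps.
def threshold(img, t=130):
    """Binarize: pixels at or above t become white (255), others black."""
    return [[255 if p >= t else 0 for p in row] for row in img]

def scale_nn(img, factor=2):
    """Nearest-neighbour upscaling by an integer factor."""
    return [[img[y // factor][x // factor]
             for x in range(len(img[0]) * factor)]
            for y in range(len(img) * factor)]
```

In practice you would do this with an image library rather than nested lists; the point is only the order of operations: filter, binarize, then scale up before recognition.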
Good luck,
Cong.
--
Hi KIP,
I am a Hindu, but not devoted to any one particular God!
The things that made me so sensitive about the images in that email were as follows:
1- Whenever a person dies in India and is being carried to the cremation ground, whoever sees that 'yatra' (procession) always remembers the God he believes in
2- When you look at headstone images of someone who has died, you have to pay respect, and I pay it by remembering my God and by simply offering all my help to Jon
3- Lastly, I am currently working on background removal from medical scans in DICOM images, so it was technically related!
I hope that answers your questions. If not, let us discuss this offline; otherwise we may get labeled as off-topic :)
Best Regards,
Vicky
From: tesser...@googlegroups.com [mailto:tesser...@googlegroups.com] On Behalf Of Kip Hughes
Sent: Monday, February 21, 2011 10:00
To: tesser...@googlegroups.com
The code I have written is in MATLAB. Will you be able to convert it into
OpenCV code? Let me know.
In OpenCV, applying simple thresholding should work. My method
(local minima) is a little more complicated (and more accurate) than simple
thresholding, and is therefore harder to implement in C++ because of the
interpolation step. I think OpenCV can do this, but we need to take a closer
look at that step.
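For reference, "simple thresholding" here means a single global threshold. A common automatic way to pick that threshold is Otsu's method (what `cv2.threshold` with the Otsu flag does in OpenCV); this is a plain-Python sketch of it, offered as a stand-in rather than Vicky's local-minima code:

```python
# Otsu's global threshold: pick the cut that maximizes
# between-class variance of the pixel histogram.
def otsu_threshold(pixels):
    """Return the best threshold for a flat list of 8-bit pixel values."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    total_sum = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w_b = sum_b = 0
    for t in range(256):
        w_b += hist[t]            # background weight
        if w_b == 0:
            continue
        w_f = total - w_b         # foreground weight
        if w_f == 0:
            break
        sum_b += t * hist[t]
        m_b = sum_b / w_b         # background mean
        m_f = (total_sum - sum_b) / w_f  # foreground mean
        var = w_b * w_f * (m_b - m_f) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t
```

On a cleanly bimodal image this lands between the two intensity clusters; the weakness Vicky is pointing at is that a single global cut struggles when illumination varies across the stone, which is what a local method addresses.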
Best Regards,
Vicky
-----Original Message-----
From: Jon Andersen [mailto:jand...@gmail.com]
Sent: Monday, February 21, 2011 23:42
To: Vicky Budhiraja
Subject: Re: Image pre-processing for good OCR results
Vicky,
Thank you so much for responding! I appreciate your help with this
project.
I have taken thousands of photos of headstones, and am trying to use
Tesseract on them. I will make the results available through
findagrave.com, so that people can search for their relatives.
Here is a whole directory of sample images:
http://freepages.genealogy.rootsweb.ancestry.com/~janderse/cemeteries/Star%20of%20David%20Memorial%20Gardens/Garden%20of%20Haifa/
Could you send me the code or results that you found? I am trying to
use OpenCV to do the image pre-processing.
Thanks!!!
-Jon
You will certainly need to implement most of the steps that Cong Nguyen
suggests. However, complications arise if you wish to do the pre-processing
in a purely automatic way. You are going to process real photographic
images, and therefore fonts, backgrounds, lighting conditions, etc.
vary widely. That's why a "one size fits all" method (particularly for
ROI detection and background removal) won't work. You will find
that a fixed pipeline works fine on the first and second images
but fails on the third.
There are two possible ways to solve this. If you still want to do it
automatically, you'll need to choose several algorithms for every
pipeline stage and implement logic that automatically decides, based
on some metric, which algorithm works (or would have worked) best for
each image. Or you can give up on the automatic approach and switch to
manual selection of pre-processing scenarios for each image, according
to your experience.
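The first option (trying several algorithms per stage and letting a metric decide) might look roughly like this; the candidate binarizers and the scoring metric below are purely illustrative, not recommendations:

```python
# Sketch of per-image algorithm selection: run each candidate
# preprocessor and keep whichever scores best on some metric.
def binarize(img, t):
    return [[255 if p >= t else 0 for p in row] for row in img]

def fg_balance(img):
    """Toy metric: prefer a roughly even black/white split."""
    flat = [p for row in img for p in row]
    frac = sum(1 for p in flat if p == 255) / len(flat)
    return -abs(frac - 0.5)

def pick_best(img, candidates, score=fg_balance):
    """candidates: list of (name, fn); returns (name, processed image)."""
    return max(((name, fn(img)) for name, fn in candidates),
               key=lambda nr: score(nr[1]))
```

A real metric would be something tied to OCR success, such as Tesseract's own confidence on the result, but the control-flow skeleton stays the same.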
The next complication is getting results from Tesseract. Since the
quality of text in photographic images is really low, you usually
can't rely on Tesseract's top-choice recognition results representing
the actual text. IMHO the best approach here is to get all of
Tesseract's choices for every character and then remove the uncertainty
using a language model (bigram and trigram statistics). This is the best
you can do, because a dictionary won't help you much, at least for last
names.
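The idea of resolving Tesseract's per-character alternatives with bigram statistics can be sketched as a small Viterbi-style search over candidate characters. Every probability below is made up for illustration; in practice the bigram table would come from a corpus of names.

```python
import math

def best_path(choices, bigram, start):
    """choices: list of {char: confidence} per position;
    bigram[(a, b)]: P(b follows a); start[c]: P(c starts the word).
    Returns the highest-scoring character sequence."""
    # paths maps last-char -> (log score, string so far)
    paths = {c: (math.log(p) + math.log(start.get(c, 1e-6)), c)
             for c, p in choices[0].items()}
    for col in choices[1:]:
        new = {}
        for c, p in col.items():
            new[c] = max(
                (s + math.log(p) + math.log(bigram.get((prev, c), 1e-6)),
                 word + c)
                for prev, (s, word) in paths.items())
        paths = new
    return max(paths.values())[1]
```

For example, if Tesseract can't decide between the letter 'O' and the digit '0', a bigram model trained on names will strongly prefer the letter before another letter, which is exactly the disambiguation Dmitry describes.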
And then you'll have to locate names within the recognition results.
The first problem is that there can be several per headstone. The
second is that Tesseract will try to recognize as text everything it
sees in the image, including noise left over from pre-processing. So
this task can also pose some difficulties, but it seems to be mainly a
question of engineering, not of research...
To conclude, it all depends on how serious you are about investing
your time and effort into your project ))
HTH
Warm regards,
Dmitry Silaev
Dear Andres,
The recognition results I showed were achieved using my simple Tesseract 3.01 engine .NET wrapper (link here: http://code.google.com/p/tesseractdotnet/).
ROI detection was done by cropping the ROI manually; after that I used my company's software to filter.
Regarding filtering, you can analyze a control set to find a feasible way of estimating the parameters.
Thanks,
Cong.
From: tesser...@googlegroups.com [mailto:tesser...@googlegroups.com] On Behalf Of Andres
Sent: Wednesday, February 23, 2011 4:02 AM
To: tesser...@googlegroups.com
Subject: Re: Image pre-processing for good OCR results
Hello,
Dear Jon,
To begin the analysis, I also tried to detect lines and corners, but the results are not good; I think this is because the images have low contrast.
Please try analyzing some line profiles of the data:
ROI-left-profile:
https://picasaweb.google.com/congnguyenba/TesseractBasedOCR#5576706091073985362
ROI-top-profile:
https://picasaweb.google.com/congnguyenba/TesseractBasedOCR#5576706094761082706
ROI-right-profile:
https://picasaweb.google.com/congnguyenba/TesseractBasedOCR#5576706102033630978
ROI-bottom-profile:
https://picasaweb.google.com/congnguyenba/TesseractBasedOCR#5576706106389606898
After ROI detection, you may need to align the image.
My solution for this step is:
- Detect all lines (Hough transform approach), then keep the lines whose slopes are close to horizontal.
- Estimate the base slope as the mean of those slopes.
- Align the image accordingly.
Here are detected lines:
https://picasaweb.google.com/congnguyenba/TesseractBasedOCR#5576709473940745778
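The slope-estimation part of the steps above can be sketched like this: given line segments (x0, y0, x1, y1) from a Hough-style detector (not shown here), keep the near-horizontal ones and average their angles to get the skew to correct. The 15-degree tolerance is an assumed parameter.

```python
import math

def skew_angle(segments, max_deg=15.0):
    """Mean angle (degrees) of the segments within max_deg of horizontal;
    rotating the image by -skew_angle would deskew it."""
    angles = []
    for x0, y0, x1, y1 in segments:
        a = math.degrees(math.atan2(y1 - y0, x1 - x0))
        if abs(a) <= max_deg:
            angles.append(a)
    return sum(angles) / len(angles) if angles else 0.0
```

Filtering out steep segments first matters: a single vertical edge from the headstone border would otherwise drag the mean far away from the true text skew.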
Hope it’s helpful to you!
Good luck,
Cong.
I guess I'm a bit surprised that no one has yet mentioned the fact
that the Leptonica C Image Processing Library
(http://www.leptonica.com) is now required to build tesseract-ocr --
or soon will be... the current state of tesseract-ocr is a bit hazy.
My understanding is that eventually (not in the near future though)
tesseract-ocr will only use Leptonica PIXs as its in-memory image
representation.
A still-unofficial, easier-to-read, Sphinx-generated version of the
Leptonica documentation is at
http://tpgit.github.com/UnOfficialLeptDocs/. Dan is currently
hammering away at v1.68 and it should be out soon (this week?), at
which point I'll also update my unofficial version of the
documentation.
My admittedly quick/biased opinion is that OpenCV focuses on Computer
Vision while Leptonica has more "pure" Image Processing routines. I
also find Leptonica's source code fairly easy to read, because one of
the purposes of the library is to try to teach image processing
concepts.
In any case, if you're planning on using tesseract-ocr 3.x, then you
already must have liblept, so you might as well try it out.
-- TP
It's a side-effect of support for Japanese, Chinese, etc.
> We have to recognize text whose orientation we don't know in advance,
> and I know that Leptonica should be used for page layout analysis.
> However, does Tesseract offer internal facilities to recognize text
> orientation?
> And if so, how do we activate these facilities, or at least obtain
> tentative baselines?
There's an orientation/script detection module in the 3.01 code, but I
haven't even tried to use it, so I couldn't say.
--
<Leftmost> jimregan, that's because deep inside you, you are evil.
<Leftmost> Also not-so-deep inside you.
-----Messaggio originale-----
From: patrickq
Sent: Monday, February 28, 2011 7:44 PM
To: tesseract-ocr
Subject: Re: text rotated upside down or by 90°
ScanBizCards (the iPhone version) is using the Tesseract 3.0 orientation
detection; it works quite well, accurate in 95%+ of cases. The 5%
failure cases are often business cards where there isn't a lot of text
to go by, plus a lot of non-text confusing the detection.
Patrick
On Feb 28, 1:35 pm, "Jimmy O'Regan" <jore...@gmail.com> wrote:
> On 28 February 2011 15:17, Giuseppe Menga <me...@polito.it> wrote:
>
> > at Politecnico di Torino we are using release 3.0.0 of Tesseract with
> > the standard English training.
> > Obviously the software doesn't recognize pages of text rotated upside
> > down, and we would not expect it to; however, to our surprise, it
> > recognizes, with slightly worse performance, text rotated 90°
> > counter-clockwise, but not clockwise.
> > How is that possible?
>
> It's a side-effect of support for Japanese, Chinese, etc.
>
> > We have to recognize text whose orientation we don't know in advance,
> > and I know that Leptonica should be used for page layout analysis.
> > However, does Tesseract offer internal facilities to recognize text
> > orientation?
> > And if so, how do we activate these facilities, or at least obtain
> > tentative baselines?
>
> There's an orientation/script detection module in the 3.01 code, but I
> haven't even tried to use it, so I couldn't say.
>
> --
> <Leftmost> jimregan, that's because deep inside you, you are evil.
> <Leftmost> Also not-so-deep inside you.
--
Could you post some samples to analyze?
If you are afraid that Tesseract's page layout analysis doesn't work on
rotated images, you can run it step by step as below:
1. First, call Tesseract's FindLinesCreateBlockList (have a look at the
TessBaseAPI class); you should get back a BLOCK_LIST.
2. Now, please check the BLOCK_LIST.
I show only the member fields here:
...
ROW_LIST rows; //< rows in block
...
FCOORD skew_; //< Direction of true horizontal.
ICOORD median_size_; //< Median size of blobs.
And here is the ROW class:
....
inT32 kerning; //inter char gap
inT32 spacing; //inter word gap
TBOX bound_box; //bounding box
float xheight; //height of line
float ascrise; //size of ascenders
float descdrop; //-size of descenders
WERD_LIST words; //words
QSPLINE baseline; //baseline spline
...
A page contains block(s), and a block contains row(s)....
3. Try to visualize anything you need in order to get an overview of how
the segmentation/detection step worked...
Also, if you want to understand how Tesseract works, please read the
papers in the doc folder; they were published by Ray.
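As a rough illustration of the hierarchy described above: the real BLOCK and ROW classes are C++, but their shape can be mirrored in a few lines of Python (only some field names are borrowed from the listing; everything else here is invented for the sketch).

```python
# Illustrative Python mirror of the page -> block -> row -> word
# hierarchy; NOT the actual Tesseract API.
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class Row:
    bound_box: Tuple[int, int, int, int]      # bounding box
    xheight: float = 0.0                      # height of line
    words: List[str] = field(default_factory=list)

@dataclass
class Block:
    skew: Tuple[float, float] = (1.0, 0.0)    # direction of true horizontal
    rows: List[Row] = field(default_factory=list)

def all_words(page: List[Block]) -> List[str]:
    """Walk page -> blocks -> rows -> words, in reading order."""
    return [w for b in page for r in b.rows for w in r.words]
```

Inspecting each block's skew field at this level is exactly how you would check, step by step, what the layout analysis decided about a rotated image.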
Hope it's helpful to you!
Cong.