Template matching for lower descending vowel signs


pranay prateek

Apr 12, 2011, 6:54:06 AM
to indi...@googlegroups.com



Hi

I have been discussing the detection of lower descending vowel signs with Debayan.
In my opinion, there are some cases where the histogram method might not work as well as expected,
so we need to look for additional techniques to handle them.

I have posted about using template matching for lower descending vowels here:

http://www.pranayprateek.com/?page_id=162

Of course, it works well there because the template and the searched image are both very clean and in the same font.

Could some of you provide more data to test on, ideally with more noise and with different fonts and styles?
I want to check how well this correlation method holds up. Alternatively, people on the list could try it
themselves and let me know the outcome.
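
For anyone who wants to try the correlation idea without the linked post, here is a minimal sketch of template matching by normalized cross-correlation. It is not the code from the post: the function name ncc_match is made up for illustration, the image is assumed to be a grayscale or binarized grid given as a list of rows, and it is written in plain Python so it runs without extra libraries (a real test on scanned pages would want an array library for speed).

```python
import math


def ncc_match(image, template):
    """Slide `template` over `image` and return (best_score, (row, col)).

    Both arguments are lists of rows of numbers. The score is the normalized
    cross-correlation, which lies in [-1, 1]; 1 means a perfect match up to
    brightness and contrast.
    """
    th, tw = len(template), len(template[0])
    ih, iw = len(image), len(image[0])

    # Mean-subtract the template once; it does not change per position.
    t_vals = [v for row in template for v in row]
    t_mean = sum(t_vals) / len(t_vals)
    t_dev = [v - t_mean for v in t_vals]
    t_norm = math.sqrt(sum(d * d for d in t_dev))

    best_score, best_pos = -1.0, (0, 0)
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            w_vals = [image[r + i][c + j] for i in range(th) for j in range(tw)]
            w_mean = sum(w_vals) / len(w_vals)
            w_dev = [v - w_mean for v in w_vals]
            w_norm = math.sqrt(sum(d * d for d in w_dev))
            denom = t_norm * w_norm
            # A flat window (or flat template) has no shape to correlate with.
            if denom > 0:
                score = sum(a * b for a, b in zip(w_dev, t_dev)) / denom
            else:
                score = 0.0
            if score > best_score:
                best_score, best_pos = score, (r, c)
    return best_score, best_pos
```

On noisy scans or a different font the best score will drop below 1, so in practice one would compare it against a threshold chosen on test data rather than expect an exact match.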

Pranay

http://twitter.com/pranay01

Debayan Banerjee

Apr 12, 2011, 12:36:05 PM
to indi...@googlegroups.com
On 12 April 2011 16:24, pranay prateek <prana...@gmail.com> wrote:
>
>
>
> Hi
> I have been communicating with Debayan regarding the issue of detecting
> lower descending vowel
> signs. In my opinion, there are some cases where the histogram method might
> not work as well as expected.

We could first run the histogram minima algorithm. Once it gives us
a region of certainty, we can then do template matching in that
region only. Since template matching is likely to be time-intensive, this
will reduce its running time considerably.
There are some ways to detect that the histogram minima method has
given us the wrong region. One way is to look at the height of the
region box. This height should be a roughly constant percentage of the
height of the line (or the word, or the character) itself. If we see that
the height is above or below this percentage by some margin, we can
then go on and run the template matching code.
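
A rough sketch of that check, under stated assumptions: the text line is already binarized (1 = ink) and cropped, the "histogram" is the per-row ink count, and the cut is taken at the emptiest row in the lower half of the line. The function names, the expected ratio of 0.25 and the tolerance of 0.10 are placeholders for illustration, not measured values.

```python
def row_histogram(line):
    """Count ink pixels (value 1) in each row of a binarized text line."""
    return [sum(row) for row in line]


def lower_zone_cut(line, search_from=0.5):
    """Return the row with the fewest ink pixels in the lower part of the line.

    Rows from this index downward are treated as the candidate lower zone.
    """
    hist = row_histogram(line)
    start = int(len(hist) * search_from)
    return min(range(start, len(hist)), key=lambda r: hist[r])


def needs_template_matching(line, expected_ratio=0.25, tolerance=0.10):
    """Flag the histogram result as suspicious if the lower zone height is not
    close to the expected fraction of the line height.

    Returns (suspicious, cut_row, ratio).
    """
    cut = lower_zone_cut(line)
    ratio = (len(line) - cut) / len(line)
    return abs(ratio - expected_ratio) > tolerance, cut, ratio
```

Only lines for which the first return value is True would be passed on to the slower template matching step.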

--
Debayan Banerjee

pranay prateek

Apr 13, 2011, 12:18:06 AM
to indi...@googlegroups.com, Debayan Banerjee
I agree. We should try a combination of the histogram minima and template matching
methods to get good performance. First, though, I think we should test our algorithms on a
more practical dataset. The data we have been running them on is very clean
and doesn't give us a real picture of the challenges involved. Can we get more
representative data from somewhere?

Pranay 


Abhaya Agarwal

Apr 13, 2011, 12:48:57 AM
to indi...@googlegroups.com
Try the attached image, fetched at random from the Digital Library of India collection. My guess is that some kind of preprocessing has been done on it, but it should still be good enough for testing the algorithm.

Regards,
Abhaya
--
-------------------------------------------------
blog: http://abhaga.blogspot.com
Twitter: http://twitter.com/abhaga
-------------------------------------------------
00000011.tif

Dr. Atul Negi

Apr 13, 2011, 1:12:40 AM
to indic-ocr
Template matching for lower descending vowels would not, in general,
be scalable. Template matching as such means we need to create
templates.

So the question is: would we need templates for all fonts and font sizes?

In my opinion, take a more conservative approach and first check whether
there is a lower descending vowel sign at all. How could these signs be
distinguished from the extensions of characters like "ra" or "ha", as I saw
in the image posted by Abhaya?

I am just suggesting that in the long run a more generic approach may
be better.

To do that, can we get a lower baseline? How can we get a robust
estimate of a lower baseline?
How can we say that something extending below IS a vowel descender? By
its attachment at the base of the character? Or is it something that is
flagged and taken up at the recognition level?
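
One possible way to get a lower baseline estimate, offered only as a sketch: take the lowest ink row in each column of a binarized line (1 = ink) and use the median of those values, which a few descending strokes cannot shift much. The function names and the margin of one pixel are assumptions for illustration. The sketch only flags columns that extend below the baseline; it does not answer whether the stroke is a vowel sign or an extension of a character like "ra" or "ha".

```python
from statistics import median


def column_bottoms(line):
    """Lowest ink row in each column of a binarized line; None if the column is empty."""
    height, width = len(line), len(line[0])
    bottoms = []
    for c in range(width):
        rows = [r for r in range(height) if line[r][c]]
        bottoms.append(max(rows) if rows else None)
    return bottoms


def estimate_lower_baseline(line):
    """Median of the per-column bottoms, ignoring empty columns."""
    inked = [b for b in column_bottoms(line) if b is not None]
    return median(inked)


def descender_columns(line, margin=1):
    """Columns whose ink reaches more than `margin` rows below the baseline."""
    baseline = estimate_lower_baseline(line)
    return [c for c, b in enumerate(column_bottoms(line))
            if b is not None and b > baseline + margin]
```

The flagged columns could then be handed to whichever later stage decides what the stroke actually is.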





Debayan Banerjee

Apr 13, 2011, 1:26:52 AM
to indi...@googlegroups.com
On 13 April 2011 10:42, Dr. Atul Negi <atul...@gmail.com> wrote:
> Template matching for lower descending vowels in general would not
> be scalable. Template matching as such means we need to create
> templates.


I agree template matching is not scalable.
What we could do is call Tesseract's own recognition functions
from its API (char* TesseractRect() from
http://code.google.com/p/tesseract-ocr/source/browse/trunk/api/baseapi.h
(line 277) ). We will have to integrate our solutions within
Tesseract at the end of the day anyway, so we will end up using a lot of
Tesseract functions. Most importantly, Tesseract's character matching
method is font-size independent, because it matches size-independent
characteristics of the image, and also stores size-independent
characteristics of the template during training. See the "Feature" heading
in http://tesseract-ocr.repairfaq.org/.


--
Debayan Banerjee

Abhaya Agarwal

Apr 13, 2011, 1:52:21 AM
to indi...@googlegroups.com
I believe that we need a feedback loop between segmentation and recognition to achieve high accuracy. Building it on top of a black-box recognizer is worth a try, though there might be better ways of doing it.

With that in mind, one approach is to generate not a single fixed segmentation but multiple segmentations with associated probabilities. Each of these segmentations is then fed to the Tesseract engine, which assigns a score. I am not very familiar with the API, but we can do this efficiently by caching the segmentation parts that are common.

The segmentation probabilities, engine confidence scores and a strong language model are then combined to get the final output.
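
To make the idea concrete, here is a small sketch of the scoring step. Everything in it is an assumption for illustration: recognize and lm_score are hypothetical callables standing in for the recognizer and the language model (they are not Tesseract API calls), segments are represented by hashable ids, and the three scores are combined as a weighted sum of log-probabilities.

```python
import math
from functools import lru_cache


def best_segmentation(hypotheses, recognize, lm_score, weights=(1.0, 1.0, 1.0)):
    """Pick the best-scoring segmentation hypothesis.

    hypotheses: list of (segments, seg_prob), where segments is a tuple of
                hashable segment ids and seg_prob is in (0, 1].
    recognize:  hypothetical callable, segment id -> (label, confidence in (0, 1]).
    lm_score:   hypothetical callable, text -> probability in (0, 1].
    Returns (score, text, segments) for the best hypothesis.
    """
    w_seg, w_rec, w_lm = weights

    # Segments shared between hypotheses are recognized only once.
    @lru_cache(maxsize=None)
    def cached_recognize(segment):
        return recognize(segment)

    best = None
    for segments, seg_prob in hypotheses:
        results = [cached_recognize(s) for s in segments]
        text = "".join(label for label, _ in results)
        score = (w_seg * math.log(seg_prob)
                 + w_rec * sum(math.log(conf) for _, conf in results)
                 + w_lm * math.log(lm_score(text)))
        if best is None or score > best[0]:
            best = (score, text, segments)
    return best
```

The weights would need tuning on real data; equal weights are used here only to keep the sketch simple.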

Abhaya

Debayan Banerjee

Apr 13, 2011, 1:55:15 AM
to indi...@googlegroups.com
On 13 April 2011 11:22, Abhaya Agarwal <abhaya....@gmail.com> wrote:
> I believe that we need a feedback loop between segmentation and recognition
> to achieve high accuracy. Building it on top of a black box recognizer is
> worth a try though there might be better ways of doing it.
> With that in mind, one approach can be to not generate a fixed segmentation
> but generate multiple segmentations with associated probabilities. Each of
> these segmentations are then fed to Tesseract engine which assigns a score.
> I am not very familiar with API but we can do it efficiently by caching the
> segmentation parts which are common.
> The segmentation probabilities, engine confidence score and a strong
> language model are then combined to get the final output.

Everything you have described above, Tesseract already does.
http://tesseract-ocr.googlecode.com/files/TesseractOSCON.pdf

--
Debayan Banerjee

Abhaya Agarwal

Apr 13, 2011, 2:35:44 AM
to indi...@googlegroups.com
> Everything you have described above, Tesseract already does.
> http://tesseract-ocr.googlecode.com/files/TesseractOSCON.pdf

Arr.. actually nothing in that document suggests that it does. But before I try to explain further, let me go back and explore more of the Tesseract API and then come back to the group.

One quick question: if the Tesseract recognition engine already takes care of segmentation, will whatever algorithm we develop for separating ascending and descending markers end up somewhere inside that engine?

Regards,
Abhaya

Debayan Banerjee

Apr 13, 2011, 4:11:07 AM
to indi...@googlegroups.com
On 13 April 2011 12:05, Abhaya Agarwal <abhaya....@gmail.com> wrote:
>> Everything you have described above, Tesseract already does.
>> http://tesseract-ocr.googlecode.com/files/TesseractOSCON.pdf
>
> Arr.. actually nothing in that document suggests that it does. But before I
> try to explain further, let me go back and explore more of the Tesserect API
> and then come back to the group.

http://static.googleusercontent.com/external_content/untrusted_dlcp/www.google.com/en//research/pubs/archive/35248.pdf
is the best place to learn what Tesseract does.

Yes, our aim would be to include whatever algorithms we develop
inside the engine. That means we can use Tesseract's baseline-finding
method, character recognition engine and dictionary, among many other
things.


--
Debayan Banerjee

Debayan Banerjee

Apr 13, 2011, 4:12:10 AM
to indi...@googlegroups.com
On 13 April 2011 12:05, Abhaya Agarwal <abhaya....@gmail.com> wrote:
>> Everything you have described above, Tesseract already does.
>> http://tesseract-ocr.googlecode.com/files/TesseractOSCON.pdf
>
> Arr.. actually nothing in that document suggests that it does. But before I
> try to explain further, let me go back and explore more of the Tesserect API
> and then come back to the group.

Well, slide 11 does tell you that the character chopper talks to the 4
sub-systems that do the recognition.


--
Debayan Banerjee
