Interested in letter recognition


Tarjei

26 Feb 2008, 05:11:52
to ShapeLogic
Hi, I just stumbled over your project while working on my own little
project, which is trying to develop a system to create tests for
images, more specifically for web pages.

See http://browserbeaten.blogspot.com/2008/01/using-selenium-for-getting-full-image.html
for a description.

I stumbled over your project while looking into integrating Maven and
ImageJ of all things, but you seem to have an interesting approach to
letter recognition.

I was planning to use more conventional Bayesian techniques to
recognize letters - or more precisely - font faces. I am mainly
interested in asserting that a font is of a certain size and font
face. Have you seen any good open source Bayesian classification
packages for Java?

Also, do you think your system may be useful for recognizing font
sizes etc as well?

Kind regards,
Tarjei

sami....@gmail.com

26 Feb 2008, 16:05:28
to ShapeLogic
Hi Tarjei,

Thanks for your interest in ShapeLogic.

<Have you seen any good open source Bayesian classification packages
for Java?>
No, not that I can remember.

I was very excited about this when I read Pearl's book about Bayesian
networks around 20 years ago:
Probabilistic Reasoning in Intelligent Systems: Networks of Plausible
Inference.
I have not dabbled in this for quite a while.

<Also, do you think your system may be useful for recognizing font
sizes etc as well?>
ShapeLogic is a toolkit for declarative programming in image
processing; the letter recognition part was just a proof of concept.
Finding the font sizes should be simple: just take the height of the
bounding box for the polygons that are letters.
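
As a rough sketch of the bounding-box idea, assuming each letter polygon is available as an array of {x, y} pixel coordinates (the class and method names here are made up for illustration and are not part of the ShapeLogic API):

```java
// Hypothetical helper, not part of ShapeLogic.
class GlyphHeight {
    // Returns the height in pixels of the axis-aligned bounding box
    // around a polygon given as rows of {x, y} coordinates.
    static int boundingBoxHeight(int[][] polygon) {
        if (polygon.length == 0) {
            throw new IllegalArgumentException("empty polygon");
        }
        int minY = Integer.MAX_VALUE;
        int maxY = Integer.MIN_VALUE;
        for (int[] point : polygon) {
            minY = Math.min(minY, point[1]);
            maxY = Math.max(maxY, point[1]);
        }
        return maxY - minY;
    }
}
```

Note that the height depends on the letter (an "x" is shorter than an "H" in the same font), so you would probably want to compare the same letter, or the tallest letters on a line, against the expected size.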

The release of ShapeLogic 1.0 is a few days away. Here are a few
features:

* This is the first beta quality release.
* It should be a lot easier for the user to go in and define their
own rules.
* This can handle more than one polygon at a time.

So if you want to try it out, I would recommend that you wait until
ShapeLogic 1.0 is out.

It is always hard to judge the quality of your own work, but I think
that it should save you a few months of work.

But there is still a lot of work between starting with ShapeLogic and
having a useful character recognition program.

Good luck,
-Sami Badawi
http://www.shapelogic.org



sami....@gmail.com

27 Feb 2008, 10:57:20
to ShapeLogic
Hi again Tarjei,

I have a few more ideas in response to your posting.
I was considering using machine learning to generate rules for letter
matching, so I was looking for software that could do that. Here are a
few links I found:

http://www.artima.com/lejava/articles/data_mining.html
http://en.wikipedia.org/wiki/Java_Data_Mining

The top choice I found was a Java software package called Weka:
http://en.wikipedia.org/wiki/Weka_%28machine_learning%29
http://www.cs.waikato.ac.nz/~ml/weka
http://weka.sourceforge.net/wekadoc/index.php/en%3APrimer

The Weka software is GPL, so I cannot directly use it in ShapeLogic,
which is under the MIT license. But you can use it to help generate the
rules you need for letter matching. Maybe this is what you need.

Even with a good package for Bayesian classification, you will still
need software that can find and analyze the features that you need as
input for the Bayesian classification. ShapeLogic might work for that.
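
To make the split between feature extraction and classification concrete, here is a minimal Gaussian naive Bayes sketch in plain Java. It is only an illustration of the idea: it is not Weka's API, it is much simpler than the Bayesian networks in Pearl's book, and the labels and the two features (glyph height and a width/height ratio) are assumptions made up for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: Gaussian naive Bayes over numeric glyph features.
class NaiveBayesSketch {
    // Training rows grouped by label, e.g. "arial-12".
    private final Map<String, List<double[]>> samples = new HashMap<>();

    void add(String label, double... features) {
        samples.computeIfAbsent(label, k -> new ArrayList<>()).add(features);
    }

    // Returns the label with the highest log posterior, treating each
    // feature as an independent Gaussian within a class.
    String classify(double... features) {
        int total = 0;
        for (List<double[]> rows : samples.values()) {
            total += rows.size();
        }
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, List<double[]>> entry : samples.entrySet()) {
            List<double[]> rows = entry.getValue();
            double score = Math.log((double) rows.size() / total); // log prior
            for (int i = 0; i < features.length; i++) {
                double mean = 0;
                for (double[] row : rows) {
                    mean += row[i];
                }
                mean /= rows.size();
                double variance = 1e-6; // small floor avoids division by zero
                for (double[] row : rows) {
                    variance += (row[i] - mean) * (row[i] - mean) / rows.size();
                }
                double diff = features[i] - mean;
                // Log of the Gaussian density for this feature.
                score += -0.5 * Math.log(2 * Math.PI * variance)
                        - diff * diff / (2 * variance);
            }
            if (score > bestScore) {
                bestScore = score;
                best = entry.getKey();
            }
        }
        return best;
    }
}
```

You would train it with calls like add("arial-12", 12.0, 0.50) for measured glyphs and then call classify(height, ratio) on a new glyph. The hard part remains producing good feature values in the first place.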

Hope this was helpful,
-Sami Badawi
http://www.shapelogic.org