OCR (Optical Character Recognition) and Mathematics

John Goche

Jul 16, 2009, 5:27:51 AM

Hello,

How difficult would it be to use OCR to scan some
mathematics (such as mathematics produced using
the LaTeX typesetting system) and have a LaTeX
source file output as a result of applying OCR
software on the scanned in mathematics?
What free and commercial OCR software
is currently available for this purpose?
Finally does anyone know whether
Google uses its own home-brewed
OCR software to scan patents
and other documents or do
they use publicly available
free or commercial software
for their purposes?

Thanks,

John Goche

Christian Stapfer

Jul 16, 2009, 11:14:50 AM

"John Goche" <johng...@googlemail.com> wrote in message
news:6d519142-6b0a-4131...@c1g2000yqi.googlegroups.com...

>
> Hello,
>
> How difficult would it be to use OCR to scan some
> mathematics (such as mathematics produced using
> the LaTeX typesetting system) and have a LaTeX
> source file output as a result of applying OCR
> software on the scanned in mathematics?
> What free and commercial OCR software
> is currently available for this purpose?
The main problem would be, IMHO, to recognize formulas in
scanned images. Outputting those formulas as LaTeX, once
they are recognized, seems like a very minor problem to me.
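
To illustrate why the output step is the easy part, here is a minimal Python sketch. It assumes the recognizer has already produced a small syntax tree. The node classes (Sym, Seq, Sup, Frac, Sum) and the function to_latex are hypothetical names invented for this illustration; they are not InftyReader's data model or API.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Sym:
    """A single recognized symbol, already mapped to its LaTeX spelling."""
    tex: str


@dataclass
class Seq:
    """Symbols or sub-formulas placed side by side on one baseline."""
    items: List["Node"]


@dataclass
class Sup:
    """A base with a superscript."""
    base: "Node"
    exponent: "Node"


@dataclass
class Frac:
    """A fraction with numerator and denominator."""
    num: "Node"
    den: "Node"


@dataclass
class Sum:
    """A sigma sign with lower limit, upper limit and summand."""
    lower: "Node"
    upper: "Node"
    body: "Node"


Node = Union[Sym, Seq, Sup, Frac, Sum]


def to_latex(node: Node) -> str:
    """Walk the tree recursively and emit LaTeX source for it."""
    if isinstance(node, Sym):
        return node.tex
    if isinstance(node, Seq):
        return " ".join(to_latex(item) for item in node.items)
    if isinstance(node, Sup):
        return f"{{{to_latex(node.base)}}}^{{{to_latex(node.exponent)}}}"
    if isinstance(node, Frac):
        return f"\\frac{{{to_latex(node.num)}}}{{{to_latex(node.den)}}}"
    if isinstance(node, Sum):
        return (f"\\sum_{{{to_latex(node.lower)}}}"
                f"^{{{to_latex(node.upper)}}} {to_latex(node.body)}")
    raise TypeError(f"unknown node type: {type(node).__name__}")


# Usage: the sum of k squared for k from 1 to n.
example = Sum(
    lower=Seq([Sym("k"), Sym("="), Sym("1")]),
    upper=Sym("n"),
    body=Sup(Sym("k"), Sym("2")),
)
print(to_latex(example))  # prints \sum_{k = 1}^{n} {k}^{2}
```

All of the difficulty sits in building such a tree from pixels; once it exists, the serializer is a few lines per node type.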

One program that tries to recognize formulas in scanned images is

http://www.inftyproject.org/en/software.html#InftyReader

Regards,
Christian

Dr Engelbert Buxbaum

Jul 16, 2009, 12:08:35 PM

On 16.07.2009 at 05:27, John Goche <johng...@googlemail.com> wrote:

> How difficult would it be to use OCR to scan some
> mathematics (such as mathematics produced using
> the LaTeX typesetting system) and have a LaTeX
> source file output as a result of applying OCR
> software on the scanned in mathematics?

Next to impossible. OCR recognizes the characters, but not their meaning.

Christian Stapfer

Jul 16, 2009, 1:52:11 PM

"Dr Engelbert Buxbaum" <engelber...@hotmail.com> wrote in
message news:op.uw54g...@bengelbert-dm.rusm.rossu.loc...

> On 16.07.2009 at 05:27, John Goche <johng...@googlemail.com> wrote:
>
>> How difficult would it be to use OCR to scan some
>> mathematics (such as mathematics produced using
>> the LaTeX typesetting system) and have a LaTeX
>> source file output as a result of applying OCR
>> software on the scanned in mathematics?
>
> Next to impossible.

What, then, does InftyReader

http://www.inftyproject.org/en/software.html#InftyReader

do?

> OCR recognizes the characters, but not their meaning.

Well, "meaning" is a somewhat vague term. It is "only" the syntactical
structure of the mathematical formula, given a raster image of it, that
needs to be recognized. The "meaning" of the formula is, IMHO, something
else again. But, certainly, even recognizing "only" the syntactical
structure of a formula, given a raster image of it, is a non-trivial
problem, to say the least.

Regards,
Christian

Eduardo M KALINOWSKI

Jul 16, 2009, 5:28:31 PM

Christian Stapfer wrote:
> "Dr Engelbert Buxbaum" <engelber...@hotmail.com> wrote in
> message news:op.uw54g...@bengelbert-dm.rusm.rossu.loc...
>> On 16.07.2009 at 05:27, John Goche <johng...@googlemail.com> wrote:
>>
>>> How difficult would it be to use OCR to scan some
>>> mathematics (such as mathematics produced using
>>> the LaTeX typesetting system) and have a LaTeX
>>> source file output as a result of applying OCR
>>> software on the scanned in mathematics?
>> Next to impossible.
>
> What, then, does InftyReader
>
> http://www.inftyproject.org/en/software.html#InftyReader
>
> do?

Never looked at it, but for my guess, see below.

>> OCR recognizes the characters, but not their meaning.
>
> Well, "meaning" is a somewhat vague term. It is "only" the syntactical
> structure of the mathematical formula, given a raster image of it, that
> needs to be recognized. The "meaning" of the formula is, IMHO, something
> else again. But, certainly, even recognizing "only" the syntactical
> structure of a formula, given a raster image of it, is a non-trivial
> problem, to say the least.

It shouldn't be so hard to identify, say, a boldface 'v' and have it
output \mathbf{v} or something like that. I suppose that's what the
InftyReader program does. But it cannot identify that that particular
symbol is referring to a vector, and that many other similar ones are
also vectors.

It also cannot identify common constructions in the particular field
used in the document. To give a simple example, if it sees
something like $A^T$ it can output that, but it cannot identify that as
a notation for the transpose. So it cannot output something like
$\transp{A}$ (where \transp is a new command defined to output the
transpose of a matrix).

So you should be able to get a visual representation of the formula, but
no semantics. Which means that if you intend to edit it later, things
will be somewhat hard, especially if you want to change a notation
(using $A^t$ for all transposes, for example).

Christian Stapfer

Jul 17, 2009, 12:29:19 AM

"Eduardo M KALINOWSKI" <edu...@kalinowski.com.br> wrote in message
news:h3o629$11h$1...@aioe.org...

> Christian Stapfer wrote:
>> "Dr Engelbert Buxbaum" <engelber...@hotmail.com> wrote in
>> message news:op.uw54g...@bengelbert-dm.rusm.rossu.loc...
>>> On 16.07.2009 at 05:27, John Goche <johng...@googlemail.com> wrote:
>>>
>>>> How difficult would it be to use OCR to scan some
>>>> mathematics (such as mathematics produced using
>>>> the LaTeX typesetting system) and have a LaTeX
>>>> source file output as a result of applying OCR
>>>> software on the scanned in mathematics?
>>> Next to impossible.
>>
>> What, then, does InftyReader
>>
>> http://www.inftyproject.org/en/software.html#InftyReader
>>
>> do?
>
> Never looked at it, but for my guess, see below.
>
>>> OCR recognizes the characters, but not their meaning.

Maybe you are underestimating InftyReader. Why not have a
look at the examples that are found at

http://www.inftyproject.org/en/demo.html#0002

It seems that InftyReader can correctly recognize such things as
sums (Greek sigma signs with a summation index and lower and upper
limits) and the like, which means that it apparently can recognize some
of the syntactical structure of mathematical formulas.

> It also cannot identify common constructions in the particular field
> used in the document. To give a simple example, if it sees
> something like $A^T$ it can output that, but it cannot identify that as
> a notation for the transpose.

IMHO you fall into the trap of confusing recognition of syntactical
structure with recognition of "meaning" here: If the program can
recognize T as a superscript character, and correctly typeset it in
LaTeX as such, that is all you can reasonably expect from it.
You cannot expect a mathematician to always grasp "the" (intended)
meaning of some formula (qua syntactical structure) from a mere
scan of the formula either (in some cases, he would have to examine
the surrounding text, maybe from page 1 to page 499 or so...)

> So it cannot output something like
> $\transp{A}$ (where \transp is a new command defined to output the
> transpose of a matrix).

That's agreed (though for those cases, where notation is sufficiently
standardized - which is not always the case among mathematicians,
you know - it would not seem to be a major problem to do some
further "recognition" of such standard "meanings" assigned to
standardized syntax, once the syntactical structure has been recovered
from a mere raster image of the formula). But certainly, the program
does not understand the overall text that assigns "meaning" to
possibly quite "nonstandard syntax".
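
As a rough sketch of that kind of second pass, and only under the assumption that the document uses a superscript T on a single uppercase letter exclusively for transposes, a post-processing step over the recovered LaTeX could look like this in Python. The function name mark_transposes is made up for illustration, and \transp is the user-defined macro from the quoted example, not a standard LaTeX command.

```python
import re

# A single uppercase letter, optionally wrapped in \mathbf{...},
# followed by a superscript T written either as ^T or as ^{T}.
_TRANSPOSE = re.compile(r"(\\mathbf\{[A-Z]\}|[A-Z])\^(?:T|\{T\})")


def mark_transposes(latex: str) -> str:
    """Rewrite X^T as \\transp{X}, assuming ^T always means transpose."""
    # A function is used as the replacement so that backslashes in the
    # output are taken literally instead of as regex escapes.
    return _TRANSPOSE.sub(lambda m: r"\transp{" + m.group(1) + "}", latex)


print(mark_transposes(r"$A^T B^{T} = (BA)^T$"))
# prints $\transp{A} \transp{B} = (BA)^T$
```

The parenthesized product is left alone because the pattern only looks at one letter, and the rule would misfire in any text where T is an ordinary exponent. That is exactly the limit described above: the rewrite works for standardized notation and cannot know what the surrounding text intends.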

> So you should be able to get a visual representation of the formula,

I thought that the original poster (John Goche) wanted basically just
that.

> but no semantics.
Well, yes, in a way: and that's just what I wrote. Recognition of
syntax is already a good thing. Recognition of "meaning" from
a mere local[!] scan of a formula is not even generally possible
for a mathematician.

> Which means that if you intend to edit it later, things
> will be somewhat hard, especially if you want to change a notation
> (using $A^t$ for all transposes, for example).

I think you are expecting too much. If InftyReader can deliver what
it claims, it is already doing a great job indeed.

Regards,
Christian

Dr Engelbert Buxbaum

Jul 17, 2009, 3:32:20 PM

On 17.07.2009 at 00:29, Christian Stapfer <nob...@nowhere.nil> wrote:

> IMHO you fall into the trap of confusing recognition of syntactical
> structure with recognition of "meaning" here: If the program can
> recognize T as a superscript character, and correctly typeset it in
> LaTeX as such, that is all you can reasonably expect from it.
> You cannot expect a mathematician to always grasp "the" (intended)
> meaning of some formula (qua syntactical structure) from a mere
> scan of the formula either (in some cases, he would have to examine
> the surrounding text, maybe from page 1 to page 499 or so...)

Yes, but a language like LaTeX (or MathML), if correctly applied, can
actually represent that meaning. That is why it is next to impossible to
generate the full meaning of a LaTeX source from a scan of its output.
You would need some sort of expert system that replaces the
mathematician's brain.

The same is of course true for text structure. An OCR system may recognise
that a line is written in 14 pt sans serif bold extended, but that this
means "heading of second order" it has no way to grasp. LaTeX represents
that meaning as \section{}, and an experienced human reader of the output
would intuitively grasp it.
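
To make the gap concrete, here is a small Python sketch of the kind of layout guess a program could make. It assumes the OCR stage reports font size and weight per line; the Line record and guess_structure are hypothetical names for this illustration and do not describe any real OCR package.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List

LEVELS = [r"\section", r"\subsection", r"\subsubsection"]


@dataclass
class Line:
    """One recognized line with the typographic attributes OCR reports."""
    text: str
    size_pt: float
    bold: bool


def guess_structure(lines: List[Line]) -> List[str]:
    """Guess sectioning commands purely from relative font size."""
    if not lines:
        return []
    # Treat the most frequent font size as running body text.
    body_size = Counter(line.size_pt for line in lines).most_common(1)[0][0]
    # Bold lines larger than body text are ranked: largest size first.
    heading_sizes = sorted(
        {line.size_pt for line in lines
         if line.bold and line.size_pt > body_size},
        reverse=True,
    )
    out = []
    for line in lines:
        if line.bold and line.size_pt in heading_sizes:
            rank = min(heading_sizes.index(line.size_pt), len(LEVELS) - 1)
            out.append(f"{LEVELS[rank]}{{{line.text}}}")
        else:
            out.append(line.text)
    return out
```

The guess is only relative: on a scanned excerpt whose largest heading was really a \subsection in the source, this sketch would still emit \section, because nothing in the image says otherwise.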

zbin...@gmail.com

Nov 17, 2015, 3:30:44 AM

Hi, OCR can only recognize the text in an image; it cannot detect what the content means. There are two important OCR tools that Google develops or sponsors: one is the Tesseract OCR engine, the other is Google Docs. Both can convert images to text. There is also a free online OCR service that uses Tesseract: http://www.online-code.net/ocr.html