Development of tesseract-ocr

Grzegorz Jablonski

Jul 6, 2010, 4:28:47 AM
to tesseract-ocr
Hello,
I have some questions about the future development of Tesseract:
1) When will the new release be available?
2) How long will Google support this project?
3) Will page-layout analysis be available in the new release?

Grzegorz Jablonski

Jimmy O'Regan

Jul 6, 2010, 12:48:58 PM
to tesser...@googlegroups.com
On 6 July 2010 09:28, Grzegorz Jablonski <jablonski...@gmail.com> wrote:
> Hello,
> I have some questions about the future development of Tesseract:
> 1) When will the new release be available?

When it's ready.

> 2) How long will Google support this project?

No idea; it's included in Android, and I would speculate, based on one
of the Neven patents, that it's used by Google Goggles; other
speculation you might come across is that it's being used in
Google Books and/or in Google Docs' new(-ish) OCR feature. In any
event, it seems likely that Google will continue development
internally for some time to come.

But the *real* answer depends entirely on what exactly you mean by 'support'.

> 3) Will page-layout analysis be available in the new release?
>

Yes. If you use Linux or some other Unix-like operating system, you
can check it out from SVN now. If you use Windows, your only option is
to wait.

--
<Leftmost> jimregan, that's because deep inside you, you are evil.
<Leftmost> Also not-so-deep inside you.

Grzegorz Jablonski

Jul 7, 2010, 9:56:15 AM
to tesseract-ocr


On 6 Jul, 18:48, "Jimmy O'Regan" <jore...@gmail.com> wrote:
> But the *real* answer depends entirely on what exactly you mean by 'support'.
I mean continued development: adding better modules (such as page-layout
analysis) and improving accuracy.
>
> > 3) Will page-layout analysis be available in the new release?
>
> Yes. If you use Linux or some other Unix-like operating system, you
> can check it out from SVN now. If you use Windows, your only option is
> to wait.
I managed to compile and run the latest SVN version on Windows 7 with MS
Visual Studio 2010 (and also 2008).
Best,
gj

Jimmy O'Regan

Jul 7, 2010, 12:16:56 PM
to tesser...@googlegroups.com
On 7 July 2010 14:56, Grzegorz Jablonski <jablonski...@gmail.com> wrote:
>> But the *real* answer depends entirely on what exactly you mean by 'support'.
> I mean continued development: adding better modules (such as page-layout
> analysis) and improving accuracy.

OK, I think it's safe to assume that there is still ongoing research
at Google using Tesseract.

>>
>> > 3) Will page-layout analysis be available in the new release?
>>
>> Yes. If you use Linux or some other Unix-like operating system, you
>> can check it out from SVN now. If you use Windows, your only option is
>> to wait.
> I managed to compile and run the latest SVN version on Windows 7 with MS
> Visual Studio 2010 (and also 2008).

The '...and run' part I find suspicious; there's a known bug that
hasn't been fixed yet. I hope to get some time to get to it (and a
Windows machine) at the weekend.

Grzegorz Jablonski

Jul 8, 2010, 2:48:57 AM
to tesseract-ocr


On 7 Jul, 18:16, "Jimmy O'Regan" <jore...@gmail.com> wrote:
> The '...and run' part I find suspicious; there's a known bug that
> hasn't been fixed yet. I hope to get some time to get to it (and a
> Windows machine) at the weekend.
Yeah, I changed one line:
bool is_tiff = false;  // fileFormatIsTiff(fp);

piotrek_s

Jul 8, 2010, 3:33:25 AM
to tesseract-ocr
Confirmed: the same one-line change worked for me on Windows 7.
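
Note on the workaround: the one-line change quoted in this thread hard-codes
is_tiff to false instead of calling fileFormatIsTiff(fp), so that code path
presumably never treats the input as TIFF. The thread does not say what the
underlying bug is. Purely as an illustration of what a TIFF signature check
involves, here is a minimal C++ sketch. The names looksLikeTiff and
fileLooksLikeTiff are hypothetical and are not part of Tesseract or its
dependencies; this is not the project's fix, and it ignores BigTIFF.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Returns true if the buffer starts with a classic TIFF signature:
// "II" 0x2A 0x00 (little-endian) or "MM" 0x00 0x2A (big-endian).
// BigTIFF (version 43) is deliberately not handled in this sketch.
bool looksLikeTiff(const unsigned char* data, std::size_t len) {
  if (data == nullptr || len < 4) return false;
  const bool little = data[0] == 'I' && data[1] == 'I' &&
                      data[2] == 0x2A && data[3] == 0x00;
  const bool big = data[0] == 'M' && data[1] == 'M' &&
                   data[2] == 0x00 && data[3] == 0x2A;
  return little || big;
}

// Hypothetical helper: reads the first four bytes of a file and checks them.
// The stream is opened in binary mode so no newline translation is applied
// to the header bytes. A missing or too-short file yields false.
bool fileLooksLikeTiff(const std::string& path) {
  std::ifstream in(path, std::ios::binary);
  unsigned char header[4] = {0, 0, 0, 0};
  in.read(reinterpret_cast<char*>(header), sizeof(header));
  if (in.gcount() != 4) return false;
  return looksLikeTiff(header, sizeof(header));
}
```

A check like this only inspects the file header; whether it would sidestep
the Windows problem discussed here is an open question, since the cause of
that bug is not described in the thread.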