Brazilian Portuguese data

thadeu.penna

Aug 30, 2007, 4:05:56 PM
to tesseract-ocr
Hi,

I have created a package with Brazilian Portuguese data. I do not know
if it is the best approach, but I have 300 words in freq-dawg and
260,000 words in the full dictionary. Is that an appropriate ratio?
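
For readers wondering how two such word lists might be produced, here is a minimal sketch, assuming a plain-text corpus is available. It is not necessarily how this package was built: the function name `split_word_lists`, the sample text, and the cutoff parameter are hypothetical. It counts word occurrences, keeps the most frequent words as the short list, and keeps every distinct word as the full list. For scale, 300 out of 260,000 is roughly 0.12%.

```python
import re
from collections import Counter


def split_word_lists(text, n_frequent=300):
    """Return (frequent_words, all_words) extracted from raw corpus text.

    Hypothetical helper: n_frequent is the size of the short,
    high-frequency list; all_words is every distinct word, sorted.
    """
    # [^\W\d_]+ matches runs of Unicode letters only, so accented
    # Portuguese words such as "é" or "física" stay intact.
    words = re.findall(r"[^\W\d_]+", text.lower())
    counts = Counter(words)
    frequent = [word for word, _ in counts.most_common(n_frequent)]
    all_words = sorted(counts)
    return frequent, all_words


# Tiny illustrative corpus (made up for this example).
sample = "a casa é azul e a casa é grande e a rua é longa"
frequent, all_words = split_word_lists(sample, n_frequent=2)

# Fraction of the full dictionary that ends up in the frequent list.
ratio = len(frequent) / len(all_words)
print(frequent, len(all_words), ratio)
```

The resulting lists would still have to be converted into Tesseract's dawg format with the project's own training tools, which this sketch does not cover.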

As soon as I learn how to do the job as well as possible, I would
like to submit it to the official distribution. How should I
proceed?

P.S.: The post about Tesseract-OCR on my blog, in Portuguese, is by
far my most-read post. I would just like to say thanks by giving
something back to the project.

thera...@gmail.com

Aug 30, 2007, 5:00:51 PM
to tesseract-ocr
That sounds great!

If you just send the data to me, I will add it to the downloads.
Thanks,
Ray.

JJeffman

Sep 4, 2007, 10:45:39 AM
to tesseract-ocr
Please let me know as soon as the Portuguese language is available.

Cheers, Jayme

Thadeu Penna

Sep 4, 2007, 11:19:56 AM
to tesser...@googlegroups.com
2007/9/4, JJeffman <jjef...@gmail.com>:

>
> Please let me know as soon as the Portuguese language is available.
>
> Cheers, Jayme

You can get a preliminary version at
http://profs.if.uff.br/tjpp/blog/entradas/ocr-de-qualidade-no-linux
(the package and instructions on how I built it are there). I
assume you can read Portuguese :)

--
Thadeu Penna
Prof.Associado - Instituto de Física
Universidade Federal Fluminense
http://profs.if.uff.br/tjpp/blog

withbl...@gmail.com

Sep 4, 2007, 12:50:15 PM
to tesser...@googlegroups.com
You can develop OCR for the Portuguese language with the help of Tesseract OCR.
See http://code.google.com/p/tesseract-ocr/w/list

Ray

Sep 5, 2007, 12:01:41 PM
to tesseract-ocr
Brazilian Portuguese is now on the downloads page.