After watching Dr. Luis von Ahn's TED talk about ReCaptcha, I realized
that Arabic books face the same problem (state-of-the-art OCR for
Arabic is poor, and books more than 50 years old cannot be OCRed).
However, my understanding is that ReCaptcha only displays words from
Latin-script books. I was wondering whether we could modify ReCaptcha
to detect Arabic speakers (which it already does by detecting locale)
and display words from scanned Arabic texts (Bibliotheca Alexandrina
has the largest corpus of scanned Arabic books). This would greatly
help in digitizing Arabic books.
What do you guys think? Can anybody help with this?
On May 18, 9:43 pm, بول <
pauljherr...@gmail.com> wrote: