Arabic reCAPTCHA


RobT

May 18, 2012, 3:28:21 PM
to reCAPTCHA
We've had a few instances where users are reporting that the CAPTCHAs
that appear are in Arabic. Is this possible?

I have seen screenshots, and the characters are very similar to those
of the Arabic alphabet. I know the user can simply hit refresh to
get a new reCAPTCHA character combination, but we were just wondering
whether this was possible.

Thanks,
Rob

بول

May 18, 2012, 3:36:06 PM
to reca...@googlegroups.com


On Fri, May 18, 2012 at 8:28 PM, RobT <rob....@gmail.com> wrote:
> I know the user can simply hit refresh to
> get a new reCAPTCHA character combination, but we were just wondering
> if this was possible.


Yes, it's possible. Users are not expected to transcribe them, however. (Ditto Greek or equations, which we've also had questions about.)
 
If they enter (only) the other word (which is presumed to be the 'known word') correctly, they'll pass the captcha.
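
As a rough illustration of the check described above, here is a minimal sketch. It is not reCAPTCHA's actual server code: the function name, its parameters, and the case-insensitive comparison are all assumptions made for the example.

```python
def passes_challenge(typed: str, known_word: str) -> bool:
    """Return True if the typed answer contains the known word.

    Hypothetical sketch: the answer is split on whitespace and compared
    case-insensitively (an assumption). Whatever the user typed for the
    other, untested word is ignored when deciding pass or fail.
    """
    tokens = typed.casefold().split()
    return known_word.casefold() in tokens


# The user skipped the unreadable word and typed only the known one.
print(passes_challenge("overlooks", known_word="overlooks"))
# The user guessed at the other word as well; only the known word matters.
print(passes_challenge("xyz overlooks", known_word="overlooks"))
```

Under these assumptions, a user who cannot type the Arabic-looking word still passes by entering the other word alone.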
 
If this is a recurring problem on your site, rather than one affecting 0.0000001% of your users, I'd suggest adding or amending the wording on the form where you require CAPTCHAs to explain this.
 
--
PJH


Ariel Baez

May 18, 2012, 3:38:03 PM
to reca...@googlegroups.com
Well, more importantly, what is the system expecting as a result? I have seen some words with accented characters, which I just type as the unaccented equivalent.

I once saw "pua", with the "a" rotated 180 degrees. Clearly the word is "and"; I of course typed "PUA" and it passed. (Kind of like 7-UP === dnL ===)

But I have seen cases where the word is upside down. The question is: do you enter the word, or the letters you see, even if they are upside down? And in that case, do you read the letters left to right or right to left?

Very interesting indeed. I guess it's up to the scoring algorithm to decide which is correct.

Ariel



بول

May 18, 2012, 3:43:48 PM
to reca...@googlegroups.com
The one you'd have trouble entering on a 'normal US keyboard', without the ability to add (e.g.) diacriticals, is likely to be the untested word.
 
I *think* untested words that get a high proportion of matching 'good' responses (i.e. lots of people thought *blob* was readable as 'fred', as opposed to it being refused or getting different answers) may go into the pool of known words.
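
To make that guess concrete, here is a minimal sketch of consensus-based promotion. It is purely illustrative: the function name and both thresholds (`min_votes`, `min_share`) are hypothetical, and nothing here is confirmed to match how reCAPTCHA actually decides.

```python
from collections import Counter
from typing import Iterable, Optional


def consensus_reading(
    answers: Iterable[str],
    min_votes: int = 3,       # hypothetical: minimum number of answers required
    min_share: float = 0.75,  # hypothetical: share that must agree on one reading
) -> Optional[str]:
    """Return the agreed reading of an untested word, or None if no consensus."""
    # Normalise answers and drop empty ones (users who skipped the word).
    votes = Counter(a.strip().casefold() for a in answers if a.strip())
    total = sum(votes.values())
    if total < min_votes:
        return None
    reading, count = votes.most_common(1)[0]
    return reading if count / total >= min_share else None


# Three of four users read the blob as "fred", so it could be promoted.
print(consensus_reading(["fred", "Fred", "fred", "frod"]))
# Answers disagree too much, so the word stays untested.
print(consensus_reading(["fred", "frod", "fled", "fred"]))
```

A word that returns a reading here would, under these assumptions, be a candidate for the known-word pool; one that returns None would keep being shown as the untested word.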
 
--
PJH

Ahmed Fathalla

May 20, 2012, 8:55:39 AM
to reCAPTCHA
After watching Dr. Luis von Ahn's TED talk about reCAPTCHA, I reasoned
that the same problem is encountered with Arabic books (state-of-the-art
OCR is poor, and books from more than 50 years ago cannot be OCRed).
However, my understanding is that reCAPTCHA only displays words from
Latin-script books. I was wondering whether we could modify
reCAPTCHA to detect Arabic speakers (which it already does by
detecting locale) and display words from scanned Arabic texts
(the Bibliotheca Alexandrina has the largest corpus of scanned Arabic
books). This would greatly help in digitizing Arabic-language texts.

What do you guys think? Can anybody help on this?

Ahmed Abdelali

Jul 11, 2012, 4:22:37 PM
to reca...@googlegroups.com
That would be great. As you mentioned, it is not just the Bibliotheca Alexandrina: libraries all over the world, including the Library of Congress, hold manuscripts that would be very valuable to everyone, not just Arabic readers. Once they are in a readable format, they can be translated into other languages.