Google Recaptcha V3 Algorithm


Paul

Aug 5, 2024, 7:33:40 AM
to rooftingstapring
As per this answer (assuming a similar implementation), reCAPTCHA first generates a hidden key and attaches it to a hidden input element. It also lazily renders a checkbox (not an actual checkbox input but a div) carrying the same key. When clicked, the checkbox sends an asynchronous request (XHR) to the Google backend servers to mark that key as a valid verification key (i.e. a key that has to be validated when the form is submitted).

I would assume it looks at how you behaved prior to clicking: how your cursor moved on its way to the checkbox (organic path/acceleration), which part of the checkbox was clicked (random places, or dead center every time), browser fingerprint, Google cookies and their contents, click location history tied to your fingerprint or account if it detects one, etc.


It's fairly difficult to fake "organic" behavior in such a way that it would fool a continuously learning pattern detection engine. In the cases where it's not sure, it still prompts you to match an actual CAPTCHA string.


Another interesting finding is that Google runs a VM in JavaScript that obfuscates much of reCAPTCHA's code and behavior. This VM is known as BotGuard and is used to protect other services besides reCAPTCHA.


Google is introducing reCAPTCHA v3, which looks like a "human score prediction engine" that is calibrated per website. It can be installed into different pages of a website (working like a Google Analytics script) to help reCAPTCHA and the website owner to understand the behaviour of humans vs. bots before filling a reCAPTCHA.
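To make the "human score" idea concrete, here is a minimal sketch of how a site might act on a v3 verification result once it has been parsed into a dictionary. The `success`, `score` and `action` fields are what I understand the verification response to contain; the function name `classify_score`, the three outcomes, and the 0.5 threshold are my own assumptions for illustration, and each site would tune them.

```python
def classify_score(result: dict, expected_action: str, threshold: float = 0.5) -> str:
    """Map a parsed verification response to 'allow', 'challenge' or 'reject'.

    Hypothetical helper: the outcome names and default threshold are
    illustrative assumptions, not part of reCAPTCHA itself.
    """
    # A token that failed verification outright is never trusted.
    if not result.get("success", False):
        return "reject"
    # The action reported back should match the one the page requested.
    if result.get("action") != expected_action:
        return "reject"
    # Scores run from 0.0 (likely bot) to 1.0 (likely human).
    score = float(result.get("score", 0.0))
    return "allow" if score >= threshold else "challenge"


if __name__ == "__main__":
    sample = {"success": True, "action": "login", "score": 0.9}
    print(classify_score(sample, "login"))
```

What happens on "challenge" is up to the site, for example falling back to a visible CAPTCHA or an email confirmation.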


Say the POST data (solved CAPTCHA) has a field called fingerprint, a string calculated from user behavior. I think there may also be a field for the checkbox location. My guess is that the checkbox sits in a coordinate system randomly generated by the Google back end and encrypted with my site's public key. A robot may "guess/calculate" a location for the box, but when the site owner makes the GET query with the private key to verify the user's identity, Google will decrypt the coordinate system and say whether the user clicked in the right place. So there is only one correct click location (with some offset, since it's a square box) in this random coordinate system, which is known only to Google and the site owner.


Will Google make this algorithm available to the public, or keep it for in-house use only? If it's made public, it will be very useful for small-time programmers, industrial automation (machine vision systems), autonomous cars making dynamic decisions such as road diversions, and digitizing personal diaries and notes, or even turning a bunch of snaps from my favorite books into readable text. So please consider making it available to the general public.


Considering my 90% success rate on reCAPTCHA, that means computers may have a higher chance to solve it than I do.



Not trying to point out anything, but shouldn't humans actually have a higher chance than computers?


We are having performance issues with our reCaptcha integration, because Marketo is not able to reliably validate a response within 2 minutes. There is a suggestion to proxy the reCaptcha results in a persistent database here: -Discussions/How-to-call-reCaptcha-Webhook-in-less-than-2-minut... but the web team would prefer not to set up a persistent database.


3. Marketo would call the validation web service using a webhook after a form was submitted and blacklist any leads for which the response returned success = false. The timestamp would be used for a Change Data Value trigger.


I think the disadvantages are manageable - the spam leads we were getting were not using a browser, but directly posting to the forms endpoint. Those leads would be easily weeded out by this mechanism.


Hi Sanford, appreciate the feedback. I agree it is forgeable, but not trivially. For starters, it would require somebody (a human being who knows JavaScript) to spend some time and effort, a barrier most hackers who post spam leads to Marketo instances will not bother to overcome. The sha256 hash is just an example; a more complex algorithm could be used. Finally, the website could generate changing seeds for the signature that a hacker would not be able to forge.
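A rough sketch of the "signature with changing seeds" idea, using HMAC-SHA256 from the Python standard library. This is not the scheme actually deployed; the function names, the `seed:email` message layout, and the set of currently accepted seeds are all assumptions for illustration. It also assumes the secret stays on the server: if the signing happens in browser JavaScript, the key is visible and the signature is only an obstacle, as discussed above.

```python
import hashlib
import hmac


def sign_submission(secret: bytes, seed: str, email: str) -> str:
    """Return a hex HMAC-SHA256 signature over the seed and the lead's email."""
    message = f"{seed}:{email}".encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def is_valid_submission(secret: bytes, seed: str, email: str,
                        signature: str, current_seeds: set) -> bool:
    """Check that the seed is still accepted and the signature matches."""
    # Reject seeds the website has already rotated out.
    if seed not in current_seeds:
        return False
    expected = sign_submission(secret, seed, email)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, signature)


if __name__ == "__main__":
    key = b"example-secret"  # hypothetical key, for demonstration only
    sig = sign_submission(key, "seed-001", "lead@example.com")
    print(is_valid_submission(key, "seed-001", "lead@example.com", sig, {"seed-001"}))
```

The validation web service called by the webhook would run something like `is_valid_submission` and return success = false for any lead whose signature does not check out.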


My understanding of the spammer game is as follows: They submit to random forms on the web and monitor replies. If they find an auto-response with personalized content, they just replicate the traffic in an automated way, bypassing the browser. They do not inspect the website code - the return on such effort would not be worth it. This is the attack we are trying to block.


reCAPTCHA is a free Google service that protects websites from spam and abuse by distinguishing human users from automated bots. Using machine learning and advanced risk analysis, it is a more advanced version of the traditional CAPTCHA system.


The Turing test is a method to determine whether or not computers can exhibit human-like behavior. This kind of behavior is examined by reCAPTCHA and is often employed to prevent abuse of sign-up, contact forms, or comment sections.


There are multiple types of CAPTCHA tests available, ranging from real-life images to a simple checkbox. This article will share how the different types of tests work and how to install this kind of test on your site.


CAPTCHAs are fully automated, so a computer program can grade the test without human involvement. The tests are constantly evolving as both the CAPTCHA AI and malicious bots become more advanced.


The verification process of traditional CAPTCHAs works by forcing users to solve tests before allowing access. The CAPTCHA tests use random letters and numbers, warping them in a way that is hard for automated programs to translate. Previously, this has been a sufficient deterrence method, as bots would have difficulty recognizing these distorted letters or numbers.


However, more advanced bots have since been developed that can quickly solve traditional CAPTCHAs using algorithms trained in pattern recognition. Traditional CAPTCHAs were then replaced with more complex tests in the form of reCAPTCHA v1.


These reCAPTCHA tests paired a computer-generated word with distorted text taken from images of old books or news articles. However, this version is no longer available, as it was found to be too easy for bots and too hard for human users.


The image recognition reCAPTCHA test uses either 9 or 16 low-resolution real-life images arranged in a square grid. Above these images, users will find instructions on which image sections should be selected. For example, the instructions might ask users to select all squares featuring crosswalks or fire hydrants.


This test is also available in an audio version, which makes it accessible to visually impaired users. The audio test will vocalize random letters and numbers using distorted audio, prompting users to answer using text input.


The checkbox test distinguishes humans from bots by following the cursor movement as it approaches the checkbox. Even a human user with the steadiest hand will display some randomness in cursor movement, even at a microscopic level. A bot typically will not mimic this kind of movement and tends to move in a straight line.
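As a toy illustration of the straight-line idea, the sketch below computes how direct a cursor path is: the straight-line distance between the first and last sample divided by the distance actually travelled. This is only an assumed, simplified signal for explanation; it is not how reCAPTCHA works internally, and the function name `path_straightness` is made up here.

```python
import math


def path_straightness(points):
    """Return direct distance / travelled distance for a list of (x, y) samples.

    A value of 1.0 means a perfectly straight path; lower values mean the
    cursor wandered on its way to the target.
    """
    if len(points) < 2:
        return 1.0
    # Sum the length of each segment between consecutive samples.
    travelled = sum(math.dist(a, b) for a, b in zip(points, points[1:]))
    if travelled == 0:
        return 1.0
    direct = math.dist(points[0], points[-1])
    return direct / travelled


if __name__ == "__main__":
    bot_like = [(0, 0), (1, 1), (2, 2), (3, 3)]
    human_like = [(0, 0), (1, 2), (2, -1), (3, 0)]
    print(path_straightness(bot_like), path_straightness(human_like))
```

A real detector would presumably combine many such signals (timing, acceleration, click position), since a bot can add jitter to defeat any single measure.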


Installing reCAPTCHA can be done in different ways, either manually or by using a WordPress plugin. Before installing the test, there are a couple of things to consider as well, such as the type and location of the test.


There are different types of reCAPTCHA tests available. Select which type works best for your site. We suggest you consider your visitors and what kind of test would be best for their user experience.


Then, think about where you would like to add the test. reCAPTCHA services are often available next to online forms, such as sign-up or contact pages. Knowing the location of the test beforehand will help with the installation process.


After you fill out the form, click on the Submit button. Google will generate a site key and a secret key. Use the site key in the HTML code of your site and the secret key for communication between your site and reCAPTCHA.
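For the secret-key half of that exchange, here is a minimal sketch of the server-side verification call using only the Python standard library. The endpoint and the `secret`, `response` and `remoteip` field names follow Google's verification API as I understand it; the helper names `build_verify_request` and `verify_token` are hypothetical, and error handling is left out.

```python
import json
import urllib.parse
import urllib.request

VERIFY_URL = "https://www.google.com/recaptcha/api/siteverify"


def build_verify_request(secret_key, token, remote_ip=None):
    """Build the POST request that asks Google to verify a client token."""
    fields = {"secret": secret_key, "response": token}
    if remote_ip:
        # Optional: the end user's IP address.
        fields["remoteip"] = remote_ip
    data = urllib.parse.urlencode(fields).encode("ascii")
    return urllib.request.Request(VERIFY_URL, data=data, method="POST")


def verify_token(secret_key, token, remote_ip=None, timeout=5.0):
    """Send the request and return the parsed JSON response as a dict."""
    request = build_verify_request(secret_key, token, remote_ip)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

The token is the value the widget places in the submitted form. The secret key should only ever appear in server-side code like this, never in the page HTML.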


If you use WordPress, installing reCAPTCHA is simple using a plugin. First, manually install a WordPress plugin for reCAPTCHA. While a couple of plugin options are available to add the test, there is no official reCAPTCHA plugin.


The Contact Form 7 plugin has the option to integrate reCAPTCHA protection on all of its forms. To do so, head to the Dashboard -> Contact -> Integration after you install and activate the plugin. Under the reCAPTCHA section, click on the Setup integration button.


Then, head to Dashboard -> Contact -> Add New to add the necessary information for the form. Add a title in the Enter title here section to differentiate between the forms.


Copy the shortcode and head to the WordPress editor to add the form you have created. In the Gutenberg editor, simply paste the shortcode, and the form will be added automatically with your reCAPTCHA test integrated.


Next time you want to post a comment on a blog or use an online contact form, chances are you'll be confronted by a puzzle asking you to read some blurry and distorted text. Known as CAPTCHAs, these challenges are supposed to be solvable only by humans, in order to prevent unwanted bots from using web services.


According to the company, the algorithm can now accurately recognise 90 percent of street numbers, meaning Google Maps users looking for a particular building are likely to get a more specific result.


But, given the nature of that challenge, it turns out that the algorithm is also well-suited to solving CAPTCHA puzzles designed to fox spammers using bots for services like Gmail. As Google's engineers explain in a recently published paper, the algorithm has a 99.8 percent accuracy rate when trying to decipher the hardest puzzles created by Google's own CAPTCHA service, reCAPTCHA.
