I'm not a robot: AI Agents and the end of Captchas

As if it weren't strange enough that humans have to prove on a daily basis that they are not robots, robots are now able to pass those tests better than we are.

The latest episode in this sort of reverse Blade Runner emerged on Reddit a few days ago, when a user posted screenshots showing how OpenAI's new ChatGPT Agent had easily passed the most common captcha, the one that asks us to click on “I'm not a robot”.

What is ChatGPT Agent?

ChatGPT Agent is a new system that allows OpenAI's AI assistant to control its own browser to access, navigate and perform operations on the web on our behalf (for example, shopping online or buying a train ticket). We simply supervise the process and intervene directly only at the most important steps, such as payment.

While surfing the web, ChatGPT Agent inevitably encountered captchas.

But instead of asking the human user to solve them, it did so itself, commenting: "The link has been inserted, so now I'll click the 'verify I'm a human' button to complete the verification on Cloudflare. This process is necessary to prove I'm not a bot and continue with the action."

The bot that thinks it's human

The irony of a bot encountering an obstacle designed specifically to block it and consciously deciding, if you will, to bypass it was not lost on Reddit users, who commented: “After all, it was trained with human data, why should it identify itself as a bot?”

That we humans, by solving captchas so often, were training machines to solve them is a paradox that has always been, as we will see, one of the limitations of these tests.

Captchas, short for Completely Automated Public Turing Test to Tell Computers and Humans Apart, have a very specific purpose: to prevent certain sites, especially those offering services, from being inundated with requests from automated programs that can perform the same operation over and over again.

Making sure there is a human on the other side of the screen helps prevent the internet from being flooded with spam, scam attempts, and more.

The complexity of captchas

In recent years, captchas have often been divided into two levels. The first and simplest is the "I'm not a robot" checkbox. As Ars Technica explains, this box analyzes multiple signals (including mouse movements, the time it takes to click, and the browser fingerprint that allows a user to be identified) to determine whether the clicker is human or not.
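To make the idea concrete, here is a minimal sketch of how a checkbox might combine such signals into a single verdict. It is purely illustrative: the names `ClickSignals` and `looks_human`, the weights and the thresholds are all assumptions invented for this example, not how Cloudflare, Google or any other real service scores a click.

```python
from dataclasses import dataclass


@dataclass
class ClickSignals:
    """Hypothetical signals gathered around the checkbox click."""
    path_straightness: float       # 0..1, where 1.0 is a perfectly straight mouse path
    ms_before_click: int           # time between page load and the click
    fingerprint_seen_before: bool  # whether the browser fingerprint is already known


def looks_human(signals: ClickSignals, threshold: float = 0.5) -> bool:
    """Add up made-up weights for each human-looking signal."""
    score = 0.0
    # Human mouse paths wobble; a perfectly straight line is suspicious.
    if signals.path_straightness < 0.95:
        score += 0.4
    # Humans need a moment to react, but do not wait forever.
    if 300 <= signals.ms_before_click <= 30_000:
        score += 0.3
    # A browser with an established history is more likely a real user.
    if signals.fingerprint_seen_before:
        score += 0.3
    return score >= threshold


person = ClickSignals(path_straightness=0.7, ms_before_click=1200, fingerprint_seen_before=True)
script = ClickSignals(path_straightness=1.0, ms_before_click=5, fingerprint_seen_before=False)
print(looks_human(person), looks_human(script))
```

The sketch also shows why an agent driving a real browser at a human pace can slip through: none of these signals asks who, or what, is actually moving the mouse.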

If this first test is not passed, we move on to the second – and hated – level: the one in which we are asked to identify traffic lights, cars, pedestrian crossings or other objects within a series of images.

These visual quizzes may be more difficult, but back in 2016, researchers at Columbia University demonstrated that they could solve 70% of captchas using image recognition tools.

An inevitable paradox

That machines would catch up was inevitable, considering what happens after we solve the captchas: the data on how we correctly identify traffic lights or other features is used to train artificial intelligence systems, which thus become better at recognizing images but also, consequently, at solving the captchas. It's a paradox that has always been at the heart of these quizzes, and one that likely allowed ChatGPT Agent to simulate human movements when asked to confirm that it was not a robot.

This isn't even the only paradox in this field, as more and more click farms are hiring squads of humans to beat captcha codes en masse, effectively turning these people into bots and thus cheating the system once again.

The future beyond captchas

So, is it time to abandon those hated captchas forever?

Currently, the direction being followed is often the opposite, and involves creating increasingly complex quizzes: some require users to recognize sounds (a format originally intended for blind people) or listen to melodies and indicate where the sequence of notes repeats; in other cases, users are asked to rotate a 3D image of an animal until its muzzle points in the indicated direction.

Even these more advanced captchas are no longer enough to stop artificial intelligence.

Speaking to MIT Technology Review, computer engineering professor Mauro Migliardi explained that since AI can now be trained to solve any cognitive challenge, it is necessary to use “physical captchas,” which require, for example, rotating the phone when browsing via smartphone.

But can we really be forced to manipulate our phone like a gamepad every time we want to access a website? A simpler solution, called Privacy Pass, is being tested by a coalition of companies (including Apple, Google, and Cloudflare). It locally stores, in encrypted and anonymous form, proof of tests we have previously passed to show we're human, so that we don't have to repeat the process each time. This solution is certainly preferable to captchas, but according to an analysis conducted by Mozilla, it's not without its drawbacks, in terms of privacy risks and more.

Bot armies can't be stopped (completely)

A perfect solution doesn't exist at the moment. This is demonstrated by the fact that bot-generated traffic now accounts for the majority of online traffic, and that even the most complex captchas are now designed to make the deployment of bot armies more expensive and complex, not to stop them entirely.

In the future, the situation risks becoming even more complex: if and when we hand over the keys to our online experience to AI Agents, do we really want to have to physically intervene every single time they encounter increasingly complex tests, preventing them from solving them on their own? One thing is certain: the time has come to invent new solutions and leave captcha codes behind. No one will miss them.

La Repubblica