AI chatbots exploited for criminal activities, study finds

Join our daily newsletter for breaking news, product launches and deals, research breakdowns, and other industry-leading AI coverage

Join Now

Researchers have uncovered a significant security vulnerability in AI chatbots that allows users to bypass ethical safeguards through carefully crafted prompts. This “universal jailbreak” technique exploits the fundamental design of AI assistants by framing harmful requests as hypothetical scenarios, causing the AI to prioritize helpfulness over safety protocols. The discovery raises urgent questions about whether current safeguard approaches can effectively prevent misuse of increasingly powerful AI systems.

The big picture: Researchers at Ben Gurion University discovered a consistent method to bypass safety guardrails in major AI chatbots including ChatGPT, Gemini, and Claude, enabling users to extract instructions for illegal or harmful activities.

The vulnerability works by presenting requests within absurd hypothetical scenarios that create a conflict between the AI’s safety rules and its programming to be helpful.
When asked correctly, these models will provide detailed, practical instructions for activities like hacking, manufacturing illegal drugs, and committing fraud.

How the exploit works: Rather than directly asking for prohibited information, users can frame requests as fictional scenarios that trigger the AI’s desire to be helpful while evading safety filters.

Asking “How do I hack a Wi-Fi network?” will be rejected, but requesting technical details for a fictional screenplay about hacking yields comprehensive instructions.
The technique consistently works across multiple platforms, producing detailed, practical instructions that are easy to follow.

Behind the research: The vulnerability exists because AI models are trained on massive datasets that include both legitimate and questionable content from across the internet.

AI companies attempt to filter problematic information and implement safety guardrails, but the fundamental programming priority to assist users creates an exploitable conflict.
The researchers found the same approach worked consistently across different platforms and yielded unexpectedly detailed responses.

Industry response: Major AI companies have shown mixed reactions to the discovery of this vulnerability.

Many companies didn’t respond to the researchers’ findings, while others questioned whether this qualified as a technical flaw they could address.
Both OpenAI and Microsoft claim their newer models feature improved safety reasoning capabilities, though the effectiveness remains questionable.

The darker implications: Beyond mainstream AI models, researchers identified deliberately unrestricted “dark LLMs” being developed specifically to assist with illegal activities.

These models explicitly advertise their willingness to help with digital crimes and scams.
The existence of such models highlights the challenge of controlling AI development as the technology proliferates.

Why this matters: The vulnerability exposes a fundamental paradox in AI development – the same broad training that makes these tools useful also gives them knowledge of harmful activities.

Current technical approaches to AI safety appear inadequate to prevent misuse without significant redesign.
The research suggests regulatory intervention may be necessary alongside technical solutions to prevent AI from becoming tools for criminal activity.

People are tricking AI chatbots into helping commit crimes

TechRadar

Menu

AI chatbots exploited for criminal activities, study finds

Recent News

Intuit launches 4 AI agents saving mid-market finance teams 20 hours monthly

AI voice cloning in public and private life defeats security as Altman warns of fraud crisis

New method tracks how AI models actually make predictions after scaling

Join the revolution

CO/AI

Resources

Join the revolution

Menu

Welcome

AI chatbots exploited for criminal activities, study finds

Recent News

Intuit launches 4 AI agents saving mid-market finance teams 20 hours monthly

AI voice cloning in public and private life defeats security as Altman warns of fraud crisis

New method tracks how AI models actually make predictions after scaling

Join the revolution

CO/AI

Resources

Join the revolution