US-based Anthropic claims to have stopped a Chinese state-linked cyber effort that used its AI system to infiltrate global institutions. The company reported that the attackers relied on Claude Code to perform most operations without human intervention.
Anthropic said the September campaign targeted 30 financial and government entities and succeeded in breaching a small number of them. The attackers manipulated the AI by prompting it to pose as an employee of a legitimate cybersecurity firm.
The firm stated that this represented a significant shift from earlier AI-assisted cyberattacks. With 80–90% of tasks handled autonomously, the operation was among the first documented large-scale hacks conducted largely by an AI model.
The company noted several instances in which Claude malfunctioned, such as inventing details about victims and misidentifying publicly available open-source data as sensitive material. These errors limited the overall effectiveness of the attack.
Reactions have been mixed: some analysts say the findings expose urgent gaps in AI safeguards, while others argue that Anthropic is framing routine automation as evidence of an emerging threat.