atum@Tencent % ls -l | grep software-security
In an internal red team test before Claude Opus 4.6's release, Anthropic's Frontier Red Team did something brutally simple: they put Opus 4.6 in a sandbox environment, gave it Python and a set of standard vulnerability analysis tools, provided no specialized instructions or injected domain knowledge, and let it mine open-source repositories for vulnerabilities on its own. The result: over 500 previously unknown high-severity zero-day vulnerabilities. This number has led many security practitioners to joke, half-seriously, that "we're about to be replaced by AI." This topic deserves a serious conversation.
Simply by deploying Clawdbot and conversing with it, you could hand attackers complete control of your computer. This is not a bug but a fundamental architectural problem; in effect, it is a "feature". This article systematically analyzes the root causes of this risk, the conditions required for a successful attack, and why existing defenses can only mitigate, but never eliminate, the threat.
By 2025, our systems had automatically uncovered more than 60 real-world vulnerabilities, half of them high-risk. Looking back, we found that **our success came not from a single technical breakthrough, but from correctly tracking paradigm shifts in AI and adapting our methods at each transition**. At the same time, we observed many top-tier papers gradually losing real-world impact as they failed to adapt to those shifts. This article is our attempt to make that pattern explicit: we trace three paradigm transitions in automated vulnerability discovery from 2022 to 2025 (from "LLMs as classifiers" to "LLMs augmenting fuzzers and static analyzers" to "agentic, tool-using auditors") and discuss how understanding these shifts can help you make research and engineering bets that survive across paradigms.
Our AI-powered automated vulnerability discovery engine has uncovered more than 30 vulnerabilities across various types of important open-source software, nearly half of which pose significant real-world risks (such as RCE). In this article, we’ll share one particularly interesting case: a high-severity vulnerability (CVE-2025-57801, CVSS 8.6) discovered in the zero-knowledge proof library gnark. We’ll also be sharing more intriguing vulnerabilities in the future.