Anthropic Safeguards New AI Model Amid Discovery of Thousands of External Vulnerabilities

Anthropic has recently made waves in the tech world with its groundbreaking AI model, which has unearthed a staggering number of cybersecurity vulnerabilities across various operating systems and web browsers. Instead of releasing this powerful tool to the public, the company opted for a more strategic approach—entrusting it to organizations tasked with maintaining the security of our digital landscape. This initiative, aptly named Project Glasswing, showcases a commitment to fostering a safer internet environment.

What Is Project Glasswing?

At the heart of this initiative lies Claude Mythos Preview, Anthropic’s most advanced AI model. Partners in this major endeavor include industry giants such as Amazon Web Services, Apple, Microsoft, and Google, along with over 40 other organizations dedicated to improving critical software infrastructure.

Anthropic has committed an impressive $100 million in usage credits for Mythos Preview, complemented by a $4 million donation to support open-source security organizations. This proactive funding aims to bolster cybersecurity efforts across the board.

A Model That Surpasses Expectations

Interestingly, Mythos Preview wasn’t specifically engineered for cybersecurity tasks. According to Anthropic, its capabilities naturally evolved from advancements in code reasoning and autonomy. While it enhances the ability to patch vulnerabilities, it also gains proficiency in exploiting them, a paradox that underscores the model’s dual potential.

The model has significantly improved, achieving near saturation of existing security benchmarks. As a result, Anthropic is now turning its attention toward tackling zero-day vulnerabilities—previously unknown flaws that pose substantial risks to software developers and users alike.

Among the notable discoveries made by Mythos Preview: it identified a 27-year-old vulnerability in the OpenBSD operating system, which is renowned for its robust security. In a standout case, it autonomously executed a 17-year-old remote code execution exploit in FreeBSD, demonstrating its extraordinary capabilities without any human intervention after the original instruction to seek vulnerabilities.

The Reasons Behind Limited Release

Newton Cheng, the Frontier Red Team Cyber Lead at Anthropic, has been clear about the implications of releasing Claude Mythos Preview broadly. He stated, “We do not plan to make Claude Mythos Preview generally available due to its cybersecurity capabilities.” The rapid evolution of AI poses significant risks, including potential misuse by malicious actors, which could have dire consequences for economies and public safety.

This concern isn’t theoretical; Anthropic has previously uncovered a real-world incident where a state-sponsored group employed AI to conduct cyberattacks, autonomously infiltrating multiple targets. The intelligence community is now closely monitoring how models like Mythos Preview could reshape cyber operations, both offensively and defensively.

Addressing Open-Source Challenges

One of the critical components of Project Glasswing focuses on open-source software. As Jim Zemlin, CEO of the Linux Foundation, stated, “Security expertise has historically been a luxury reserved for organizations with large security teams.” Open-source maintainers—a group that plays a pivotal role in developing critical infrastructure—often struggle with security concerns.

In response, Anthropic has generously contributed $2.5 million to Alpha-Omega and the OpenSSF via the Linux Foundation, as well as $1.5 million to the Apache Software Foundation. These funds provide essential resources for maintainers, giving them access to cutting-edge AI-driven vulnerability scanning capabilities that were previously out of reach.

Looking Ahead: What’s Next?

Anthropic has indicated that its ultimate goal is to deploy Mythos-class models on a larger scale—but only when appropriate safeguards are firmly in place. The company plans to first implement new safety measures with its upcoming Claude Opus model. This cautious approach allows them to fine-tune safeguards with a model that doesn’t present the same level of risk as Mythos Preview.

The competitive landscape in AI is already shifting. With OpenAI unveiling GPT-5.3-Codex, also classified as high-capability for cybersecurity, it’s clear that managing AI’s capabilities will be critical moving forward. Anthropic’s approach with Project Glasswing signals a new paradigm—favoring controlled deployments over unrestricted releases as the standard for advanced models.

The future remains uncertain, and as the capabilities of these AI models proliferate, questions about their release and usage will undoubtedly continue. For now, the focus lies on maintaining security while fostering innovation.

In a world where technology evolves rapidly, we owe it to ourselves and future generations to prioritize safety without stifling progress. Stay informed and proactive in adopting the latest cybersecurity measures to safeguard our digital environments. Together, we can make the internet a safer place for everyone.