Anthropic Teams Up With Its Rivals to Keep AI From Hacking Everything

by · WIRED

Comment
LoaderSave StorySave this story
Comment
LoaderSave StorySave this story

Following leaked revelations at the end of March that Anthropic had developed a powerful new Claude model, the company formally announced Mythos Preview on Tuesday along with news of an industry consortium it has convened, known as Project Glasswing, to grapple with the cybersecurity implications of the new model and advancing capabilities more generally across the AI field.

The group includes Microsoft, Apple, and Google as well as Amazon Web Services, the Linux Foundation, Cisco, Nvidia, Broadcom, and more than 40 other tech, cybersecurity, critical infrastructure, and financial organizations that will have private access to the model, which is not yet being generally released. The idea, in part, is simply to give the developers of the world's foundational tech platforms time to turn Mythos Preview on their own systems so they can mitigate vulnerabilities and exploit chains that the model develops in simulated attacks. More broadly, Anthropic emphasizes that the purpose of convening the effort is to kickstart urgent exploration of how AI capabilities across the industry are on the precipice, the company says, of upending current software security and digital defense practices around the world.

“The real message is that this is not about the model or Anthropic,” Logan Graham, the company's frontier red team lead, tells WIRED. “We need to prepare now for a world where these capabilities are broadly available in 6, 12, 24 months. Many things would be different about security. Many of the assumptions that we’ve built the modern security paradigms on might break.”

Models developed and trained by multiple companies have increasingly been able to find vulnerabilities in code and propose mitigations—or strategies for exploitation. This creates a next generation of security's classic cat-and-mouse game in which a tool can aid defenders but can also fuel bad actors and make it easier to carry out attacks that were once too expensive or complex to be practical.

“Claude Mythos preview is a particularly big jump,” Anthropic CEO Dario Amodei said on Tuesday in a Project Glasswing launch video. “We haven't trained it specifically to be good at cyber. We trained it to be good at code, but as a side effect of being good at code, it's also good at cyber.” He adds in the video that “more powerful models are going to come from us and from others. And so we do need a plan to respond to this.”

Anthropic's Graham notes that in addition to vulnerability discovery—including producing potential attack chains and proofs of concept—Mythos Preview is capable of more advanced exploit development, penetration testing, endpoint security assessment, hunting for system misconfigurations, and evaluating software binaries without access to its source code.

In carrying out a staggered release of Mythos Preview, beginning with an industry collaboration phase, Graham says that Anthropic sought to draw on tenets of coordinated vulnerability disclosure, the process of giving developers time to patch a bug before it is publicly discussed.

“We've seen Mythos Preview accomplish things that a senior security researcher would be able to accomplish,” Graham says. “This has very big implications then for how capabilities like this should be released. Done not carefully, this could be a meaningfully accelerant for attackers.”

Project Glasswing partners, including some of Anthropic's competitors, struck a collaborative tone in statements as part of the launch.

“Google is pleased to see this cross-industry cybersecurity initiative coming together,” Heather Adkins, Google's vice president of security engineering, says in a statement. “We have long believed that AI poses new challenges and opens new opportunities in cyber defense.”

Those who maintain components of internet infrastructure and firms that develop foundational tech platforms also seem enthusiastic about the collaboration, especially given that Anthropic says use of Mythos Preview has already started to uncover thousands of critical vulnerabilities, including some decades-old bugs that have been repeatedly missed or overlooked in even the most scrutinized code.

“As we enter a phase where cybersecurity is no longer bound by purely human capacity, the opportunity to use AI responsibly to improve security and reduce risk at scale is unprecedented,” Microsoft's global CISO, Igor Tsyganskiy, says in a statement. “Joining Project Glasswing, with access to Claude Mythos Preview, allows us to identify and mitigate risk early and augment our security and development solutions so we can better protect customers and Microsoft.”

Graham says his team at Anthropic, a frontier research group, feels the urgency and the need for global collaboration.

“Probably the most important thing the group needs to do is figure out all the questions that need answers and then figure out the answers,” Graham says. “Project Glasswing is the starting point. It will fail if it’s just a handful of companies using a model. It has to grow into something even larger.”