How AI is getting better at finding security holes

AI lab Anthropic announced this week that it had developed a powerful new model the company believes could "reshape cybersecurity." It said that its latest model, Mythos Preview, was able to find "high-severity vulnerabilities, including some in every major operating system and web browser." (Patrick Sison / AP)

In the past few months, AI models have gone from producing hallucinations to effectively finding security flaws in software, according to developers who maintain widely used cyber infrastructure. That software, among other things, powers operating systems and transfers data for internet-connected devices.

While these new capabilities can help developers make software more secure, they can also be weaponized by hackers and nation states to steal information and money or disrupt critical services.

The latest development in AI's cyber capability came on Tuesday, when AI lab Anthropic announced it had developed a powerful new model the company believes could "reshape cybersecurity." It said that its latest model, Mythos Preview, was able to find "high-severity vulnerabilities, including some in every major operating system and web browser." Not only that, the model was better than its predecessors at devising ways to exploit the vulnerabilities it found, a capability that could also help malicious actors achieve their goals.

For now, the company is limiting access to the model to around 50 select companies and organizations "in an effort to secure the world's most critical software." They're calling the collaboration Project Glasswing, naming it after a butterfly species with transparent wings.

Anthropic says the risk for misuse is so high that it has no plans to release this particular model to the general public, according to the announcement, but it will release other related models. "Our eventual goal is to enable our users to safely deploy Mythos-class models at scale," the company wrote.

Still, security experts say the potential risks fall on cybersecurity professionals rather than everyday people. "I don't necessarily think that the average computer user needs to be fundamentally worried about this," said Daniel Blackford, VP of Threat Research at Proofpoint, a cybersecurity firm. "They need to be way more worried about not giving their password away because that just happens like all day, every day."

Mythos Preview may also have better capabilities to help developers fix vulnerabilities than previous models, said Jim Zemlin, CEO of the Linux Foundation, which hosts the open source Linux kernel. A kernel is an interface that lets hardware talk to software, and the Linux kernel powers some of the most widely used operating systems including Android, as well as all of the world's 500 most powerful supercomputers. The foundation is a part of Project Glasswing, and Zemlin said a core group of people who work on the Linux kernel, known as maintainers, have started experimenting with the new model to figure out the most effective way to use it.

"These maintainers are already overworked before AI," Zemlin said. "This just makes their lives a lot better."

Even before Mythos Preview's limited release, the cybersecurity community has already been grappling with how capable the most advanced commercially available AI models have become. Developers are racing to patch security vulnerabilities that hackers flag with AI. Security experts are worried about what might lie ahead if such capacities proliferate without guardrails.

Hackers are using AI to help them find bugs. It wasn't working, until now

Improvement in AI models' capabilities became noticeable in early 2026, said Daniel Stenberg. He is the lead software developer behind cURL, a 30-year-old open-source data transfer tool that's widely used in internet-connected products, including cars and medical devices. The change followed the releases of new cutting-edge models in late 2025.

Like many in the software community, Stenberg gets help from security researchers to keep his software functional and secure. So-called "white hat" hackers flag security flaws to developers in private and in return, they sometimes get reward money, known as "bug bounties," or bragging rights by getting the security flaw named after themselves. While all bugs affect the functionality of the software, only some bugs present security vulnerabilities.

But that process started to break down last year. Stenberg's team was inundated by bogus reports he was convinced were generated by AI. "We got 185 reports throughout the year and less than 5% of them were actually security related problems in the end," Stenberg said.

Even as the report volume doubled from 2024 to 2025, Stenberg ended up finding and fixing fewer security vulnerabilities. The slop reports got so bad that Stenberg stopped paying out bug bounty rewards for his cURL tool.

The hackers that file reports to him are usually anonymous, said Stenberg, and they don't discuss whether the reports were made with AI. But Stenberg said AI has a distinct writing style. "They tend to be very elaborate and descriptive … You get a 400 line report [when] it's something that a human would take 50 lines to present."

HackerOne, a platform that Stenberg uses to manage security reporting, surveyed hackers in the summer of 2025 and found that nearly 60% of respondents were either using AI, learning it, or learning to audit AI or machine learning systems.

"LLMs have now bypassed human capability for bug finding"

Things have changed dramatically this year. The volume of reports has been even higher than in 2025, but so far Stenberg said, most of them have uncovered legitimate issues. "Almost all the bad [reports] are now gone."

He estimates that about 1 in 10 of the reports describe security vulnerabilities; the rest are mostly real, non-security bugs. Just three months into 2026, the cURL team Stenberg leads has found and fixed more vulnerabilities than in each of the previous two years.

Aside from reports by external security researchers, Stenberg also uses AI to find vulnerabilities himself. With one click, AI has flagged over 100 bugs in code that had gone through rounds of review by humans and traditional code analyzers, "in almost magical ways."

Stenberg's experience is not unique. Maintainers of the Linux kernel have seen a similar change in the quality of bug reports. Nicholas Carlini, an Anthropic research scientist, was able to find vulnerabilities in the Linux kernel using an older Anthropic model and a relatively simple prompt. Carlini also used AI to find the first critical vulnerability in another 20-year-old open source project.

"LLMs have now bypassed human capability for bug finding," said Alex Stamos, chief security officer at Corridor, an AI software security company. The increase in quality of the security research reports followed the release of Anthropic's model Opus 4.5 in November, said Stamos, who was previously the head of security at Yahoo and Facebook. Since so much commercial software has open-source components, what happens to open-source projects has wider implications for the internet, Stamos said.

Can AI patch the bugs and security vulnerabilities it finds?

Stenberg appreciates that AI models currently available to everyone have become more helpful in finding bugs, but he's also wary of what future, more powerful models might bring for developers who maintain open-source software. "It's an overload of all the maintainers who are already often overloaded and understaffed and underpaid and underfunded in many ways."

Stenberg isn't part of Project Glasswing and says that a lot of critical projects, "things that are actually cornerstones of the Internet," have been left out.

Anthropic did not respond to NPR's request for comment.

But based on his experience so far, Stenberg says AI is not as good at fixing bugs and security flaws as it is at finding them.

Part of the reason is that neither the bugs nor their fixes are cut-and-dried. Like many other aspects of software development, making the judgment calls takes more time than actually writing the code. "Once we have identified the problem and agree that this is a problem, then actually fixing it is not very hard and not very time consuming. It's more the entire process up to that step that takes time and energy," Stenberg said.

Others think differently. HackerOne, the security reporting platform, is now developing an agentic AI product to find and mend vulnerabilities more autonomously.

How does AI factor into the tug of war between offense and defense in cybersecurity?

"Finding bugs is not making the bugs exploitable," Stamos said. "The first part of what we call the kill chain is discovering the flaws. The next step is… [actually] building the weapon. And the foundation models will not do that for you."

Foundation models are models made by the most advanced AI labs like Anthropic, OpenAI and Google DeepMind. Those labs have put guardrails in place to prevent their models from creating software that can be used maliciously. These models are highly proprietary and their inner workings aren't public.

But all bets are off once so-called open-weight models that are more accessible to the public catch up to those from the most advanced AI labs. Bad actors can make a copy of them and remove the guardrails against malware.

"Then we're in real trouble because you would be able to ask those models to not just find the bugs, but then to create exploit code" that can hack into systems, Stamos said. The most advanced open-weight models are less than a year behind the most advanced closed-weight models.

"It also shows how incredibly stupid it is for the Pentagon to say that Anthropic is a risk to the entire United States," Stamos added. The Pentagon has labeled Anthropic a "supply chain risk" because the company asked the government not to use its technology for autonomous weapons and mass surveillance. The label would bar government agencies and contractors from working with Anthropic. Anthropic is currently disputing the label in court.

Most of the open-weight models are made by companies based in China, which the U.S. considers its main rival in the race towards AI dominance. By not releasing Mythos Preview publicly, Stamos said, Anthropic is giving software developers as well as the U.S. time to shore up their defenses.

Copyright 2026 NPR

Huo Jingnan (she/her) is an assistant producer on NPR's investigations team.

