© 2026 Connecticut Public


In real-world test, an AI model did better than ER doctors at diagnosing patients

Researchers tested an AI model against ER doctors and found the model outperformed the humans. (shapecharge/E+/Getty Images)

A patient shows up at the hospital with a pulmonary embolism — a blood clot that has traveled to the lungs. After initially improving, their symptoms start to worsen. The medical team suspects the medication isn't working.

In steps artificial intelligence — with its own theory.

It has scanned the medical records and suspects that a history of lupus, an autoimmune condition that can lead to heart inflammation, could explain what is really ailing the patient.

Turns out, the AI model is correct.

This type of scenario could become a reality in the not-too-distant future, according to a study published Thursday in the journal Science.

Researchers based at Harvard Medical School and Beth Israel Deaconess Medical Center found that an AI reasoning model, developed by OpenAI, excelled at diagnosing patients and making decisions about managing their care. It matched and often outperformed doctors and the earlier AI model, GPT-4.

The researchers ran a series of experiments on the AI model to test its clinical acumen — including actual cases like the lupus patient who'd been previously treated at the emergency department at Beth Israel in Boston.

The team graded how well the AI model could provide an accurate diagnosis at three points in time, from triage in the ER through admission to the hospital.

Overall, AI outperformed two experienced physicians — and did so with only the electronic health records and the limited information that had been available to the physicians at the time.

"This is the big conclusion for me: it works with the messy real-world data of the emergency department," said Dr. Adam Rodman, a clinical researcher at Beth Israel and one of the study authors. "It works for making diagnoses in the real world."

Other parts of the study focused on case reports published in the New England Journal of Medicine and clinical vignettes to suss out whether the AI model could meet well-established "benchmarks" and game out thorny diagnostic questions.

"The model outperformed our very large physician baseline," said Raj Manrai, an assistant professor of biomedical informatics at Harvard Medical School, who was also part of the study.

The authors emphasize the AI relied on text alone, while in real life, clinicians need to attend to many other inputs like images, sounds and nonverbal cues when diagnosing and treating a patient.

Still, the work showcases just how far the technology has advanced in the last few years. Prior versions of large language models faltered when dealing with uncertainty and when generating a list of possible conditions that could explain a patient's symptoms, known as a differential diagnosis.

"This paper is a beautiful summary of just how much things have improved," says Dr. David Reich, chief clinical officer for Mount Sinai Health System in New York, who was not involved in the work.

"You have something which is quite accurate, possibly ready for prime time," he says. "Now the open question is how the heck do you introduce it into clinical workflows in ways that actually improve care?"

After all, arriving at some tricky, final diagnosis — which the AI model shines at — isn't necessarily reflective of how things play out "in real clinical medicine," says Reich, where the "outcomes are much more subtle and perhaps more diverse."

And the emergency department is only a small portion of the patient's total medical care. Rodman acknowledges it's unlikely AI would have done such an "impressive" job had the team provided it with the records of someone who'd spent a month in the hospital.

None of those involved in the new study believe the findings support supplanting doctors with AI, "despite what some companies are likely to say and how they're likely to use these results," says Manrai.

"I think it does mean that we're witnessing a really profound change in technology that will reshape medicine," he adds.

But the results do make the case that AI models need to be tested in a rigorous fashion, ideally through forward-looking trials that can give more certainty about how the technology ultimately impacts clinical practice.

"It's a very challenging process to design these trials," says Reich, "but this study is a perfect call to action."

Copyright 2026 NPR

Will Stone
