Research Suggests Diagnostic Decision Support Systems More Effective at Diagnosing Disease Than LLMs
New research published in JAMA Network Open suggests that diagnostic decision support systems (DDSSs) were more effective than generative AI and large language models (LLMs) at diagnosing disease.
Computer scientists at Massachusetts General Hospital (MGH) developed their own DDSS, called DXplain, in 1984. It “relies on thousands of disease profiles, clinical findings, and data points to generate and rank potential diagnoses for use by clinicians.” Researchers with MGH compared “ChatGPT, Gemini, and DXplain at diagnosing patient cases, revealing that DXplain performed somewhat better, but the LLMs [ChatGPT and Gemini] also performed well. The investigators envision pairing DXplain with an LLM as the optimal way forward, as it would improve both systems and enhance their clinical efficacy.”
Corresponding author Mitchell Feldman wrote that DDSSs “can enhance and expand clinicians’ diagnoses, recalling information that physicians may forget in the heat of the moment.” He also wrote that “combining the powerful explanatory capabilities of existing diagnostic systems with the linguistic capabilities of [LLMs] will enable better automated diagnostic decision support and patient outcomes.”
According to the research, DXplain, ChatGPT, and Gemini all “listed the correct diagnosis most of the time,” at 72%, 64%, and 58%, respectively. Without lab data, “DXplain listed the correct diagnosis 56% of the time, outperforming ChatGPT (42%) and Gemini (39%), though the results were not statistically significant.” Preliminary work building on these findings “reveals that LLMs could be used to pull clinical findings from narrative text, which could then be plugged into DDSSs.”

Matt MacKenzie | Associate Editor
Matt is Associate Editor for Healthcare Purchasing News.