Study Finds AI Feedback Improves Quality of Physician Notes

April 19, 2024
The researchers saw that using an AI tool improved the processes of determining diagnoses and making predictions.

A new study, undertaken by NYU Langone and published April 17 in NEJM Catalyst Innovations in Care Delivery, finds that AI “feedback improved the quality of physician notes written during patient visits, with better documentation improving the ability of care teams to make diagnoses and plan for patients’ future needs.”

NYU Langone has been working to train AI models to “track in dashboards how well doctors’ notes achieved the ‘5 Cs’: completeness, conciseness, contingency planning, correctness, and clinical assessment.” Now, according to this new study, notes improved by AI “resulted in an improvement in care quality across four major medical specialties: internal medicine, pediatrics, general surgery, and the intensive care unit.”

The study found “improvements across the specialties of up to 45 percent in note-based clinical assessments (that is, determining diagnoses) and reasoning (making predictions when diagnoses are unknown). In addition, contingency planning to address patients’ future needs saw improvements of up to 34 percent.”

In addition to these efforts, NYU Langone worked with generative AI chatbots like GPT-4 to “read physician notes and make suggestions.” The case study showed that “large language models [like GPT-4] could provide a method for assessing the 5Cs across medical specialties without specialized training in each.”

A side effect of more widespread electronic health record adoption is that “physician clinical notes are now four times longer on average in the United States than in other countries,” which can make it harder for clinicians to understand diagnoses provided by other clinicians.

Each of the four medical specialties in the study “achieved the institutional goal, which was that more than 75 percent of inpatient history and physical exams and consult notes were being completed using standardized workflows that drove compliance with quality metrics.”