Generative AI predicts hospital admissions from ED visits
GPT-4 outperformed traditional machine learning models when tasked with forecasting which emergency room patients may need to be admitted to the hospital.
Researchers from Mount Sinai found that the large language model (LLM) GPT-4 can accurately predict whether a patient in the emergency department (ED) is likely to require hospitalization, according to a study published this month in the Journal of the American Medical Informatics Association.
The research team indicated that AI and LLMs have the potential to improve ED operations by supporting clinical decision-making around patient admission. However, the researchers noted that there is little research to date that incorporates real-world data into an LLM for this purpose.
To bridge this gap, the team pulled structured and unstructured EHR data from over 864,000 ED visits across seven Mount Sinai Health System hospitals.
This information was used to train Bio-Clinical-BERT, XGBoost and an ensemble model, alongside GPT-4. Each was tasked with predicting hospital admissions in a variety of scenarios.
“We were motivated by the need to test whether generative AI, specifically [LLMs] like GPT-4, could improve our ability to predict admissions in high-volume settings such as the emergency department,” said co-senior author Eyal Klang, MD, Director of the Generative AI Research Program in the Division of Data-Driven and Digital Medicine (D3M) at Icahn Mount Sinai, in a news release. “Our goal is to enhance clinical decision-making through this technology. We were surprised by how well GPT-4 adapted to the ER setting and provided reasoning for its decisions. This capability of explaining its rationale sets it apart from traditional models and opens up new avenues for AI in medical decision-making.”
The ensemble model achieved an area under the receiver operating characteristic curve (AUC) of 0.88, an area under the precision-recall curve (AUPRC) of 0.72 and an accuracy of 82.9%.
A “naïve” version of GPT-4 achieved 0.79 AUC, 0.48 AUPRC and 77.5% accuracy. However, when given access to a limited set of clinical data, the model’s performance significantly improved, reaching 0.87 AUC, 0.71 AUPRC and 83.1% accuracy.
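For readers unfamiliar with the three reported metrics, the following is a minimal sketch of how AUC, AUPRC and accuracy can be computed for a binary admission classifier. It is not the study's code: the labels and scores are made-up toy values, the function names are hypothetical, AUPRC is approximated by the common "average precision" formulation, and the 0.5 accuracy threshold is an assumption.

```python
# Toy illustration only: synthetic labels (1 = admitted) and model scores.
labels = [1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.6, 0.2, 0.1]


def roc_auc(labels, scores):
    """AUC as the chance that a random admitted patient is scored above
    a random non-admitted patient (ties count as half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve.
    Assumes no tied scores, which holds for the toy data above."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    total_pos = sum(labels)
    hits, area = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            area += hits / rank  # precision at each newly recalled positive
    return area / total_pos


def accuracy(labels, scores, threshold=0.5):
    """Share of correct predictions at an assumed decision threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


print(f"AUC:      {roc_auc(labels, scores):.3f}")
print(f"AUPRC:    {average_precision(labels, scores):.3f}")
print(f"Accuracy: {accuracy(labels, scores):.3f}")
```

AUPRC is sensitive to how rare admissions are, so it is typically lower than AUC on the same predictions.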
The research team attributed this jump in performance to the LLM’s ability to learn effectively from small samples and to incorporate traditional machine learning predictions.
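One way to picture this combination is a prompt that pairs a handful of example cases with a conventional model's probability estimate. The sketch below is purely illustrative and does not reproduce the researchers' actual prompts: the function name, the wording, the example notes and the probability value are all hypothetical, and no LLM is called.

```python
def build_admission_prompt(triage_note, ml_probability, examples):
    """Assemble a few-shot prompt as plain text.
    All wording and field names here are hypothetical, not the study's."""
    lines = [
        "Task: predict whether this emergency department patient will be admitted.",
        "Answer 'admit' or 'discharge' and give a brief rationale.",
        "",
    ]
    # A small number of example cases stands in for "learning from small samples".
    for example in examples:
        lines.append(f"Example note: {example['note']}")
        lines.append(f"Outcome: {example['outcome']}")
        lines.append("")
    lines.append(f"Patient note: {triage_note}")
    # The conventional model's output is passed along as extra context.
    lines.append(
        f"Machine learning model's estimated admission probability: {ml_probability:.2f}"
    )
    lines.append("Prediction:")
    return "\n".join(lines)


# Made-up example cases and a made-up probability, for illustration only.
few_shot_examples = [
    {"note": "Chest pain radiating to left arm, abnormal ECG.", "outcome": "admit"},
    {"note": "Minor ankle sprain, able to bear weight.", "outcome": "discharge"},
]
prompt = build_admission_prompt(
    triage_note="Fever and shortness of breath, low oxygen saturation.",
    ml_probability=0.81,
    examples=few_shot_examples,
)
print(prompt)
```

The resulting text would then be sent to an LLM, which returns both a prediction and the kind of written rationale the researchers highlighted.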
“Our research suggests that AI could soon support doctors in emergency rooms by making quick, informed decisions about patient admissions. This work opens the door for further innovation in health care AI, encouraging the development of models that can reason and learn from limited data, like human experts do,” explained co-senior author Girish N. Nadkarni, MD, MPH, Irene and Dr. Arthur M. Fishberg Professor of Medicine at Icahn Mount Sinai, Director of The Charles Bronfman Institute of Personalized Medicine and System Chief of D3M. “However, while the results are encouraging, the technology is still in a supportive role, enhancing the decision-making process by providing additional insights, not taking over the human component of health care, which remains critical.”
This research is part of a larger effort to explore generative AI’s potential in health care.
“Our study informs how LLMs can be integrated into health care operations. The ability to rapidly train LLMs highlights their potential to provide valuable insights even in complex environments like healthcare,” noted co-author and emergency room physician Brendan Carr, MD, MA, MS, Chief Executive Officer of Mount Sinai Health System. “Our study sets the stage for further research on AI integration in health care across the many domains of diagnostic, treatment, operational, and administrative tasks that require continuous optimization.”