COVID-19 Severity Determined by Machine Learning, Predictive Analytics
Researchers used machine learning models and predictive analytics to study risk factors tied to severe cases of COVID-19.
A recent JAMA study uses machine learning models and predictive analytics to highlight risk factors associated with COVID-19 severity. By studying COVID-19 severity and risk factors over time, providers can use artificial intelligence technology to predict clinical severity and provide better care for patients.
As the COVID-19 pandemic spread across the world, scientists searched for methods to treat the virus. While other countries such as the United Kingdom and Denmark conducted person-level analyses across their populations to determine the best approaches to care delivery, medication decisions, and national interventions, the United States lacked the capacity to do so, according to researchers.
“A large, multicenter, representative clinical data set has been desperately needed by US practitioners, scientists, health care systems, and policymakers to develop predictive and diagnostic computational tools and to inform critical decisions,” the authors wrote.
To address the gaps in data, the National COVID Cohort Collaboration (N3C) formed to accelerate the understanding of COVID-19 and create a novel approach for collaborative data sharing and analytics during the pandemic.
The N3C is made up of members from the National Institutes of Health Clinical and Translational Science Awards Program and its Center for Data to Health, the IDeA Centers for Translational Research, the National Patient-Centered Clinical Research Network, the Observational Health Data Sciences and Informatics network, TriNetX, and the Accrual to Clinical Trials network.
“This report provides a detailed clinical description of the largest cohort of US COVID-19 cases and representative controls to date. This cohort is racially and ethnically diverse and geographically distributed. We evaluated COVID-19 severity and associated clinical and demographic factors over time and used machine learning to develop a clinically useful model that accurately predicts severity using data from the first day of hospital admission,” the study stated.
The research consisted of a retrospective cohort study of 1,926,526 US adults, including patients infected with COVID-19 and patients without the virus who served as controls, from 34 medical centers across the nation between January 1, 2020, and December 7, 2020.
Patients were stratified using the World Health Organization's COVID-19 severity scale and demographic data and characteristics. Differences between the groups were tracked and evaluated using multivariable logistic regression.
“Random forest and XGBoost models were used to predict severe clinical course (death, discharge to hospice, invasive ventilatory support, or extracorporeal membrane oxygenation),” the study stated.
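The outcome the models predict is a composite: a hospitalization counts as severe if any one of the four listed events occurred. A minimal Python sketch of that labeling rule follows. The class and field names are hypothetical, chosen for illustration, since the article does not describe the study's data schema.

```python
from dataclasses import dataclass


@dataclass
class Hospitalization:
    # Hypothetical field names; the study's actual schema is not given here.
    died: bool = False
    discharged_to_hospice: bool = False
    invasive_ventilation: bool = False
    ecmo: bool = False  # extracorporeal membrane oxygenation


def had_severe_course(stay: Hospitalization) -> bool:
    """Return True if any of the four events in the quoted definition occurred."""
    return (
        stay.died
        or stay.discharged_to_hospice
        or stay.invasive_ventilation
        or stay.ecmo
    )


# Three made-up stays, for illustration only.
stays = [
    Hospitalization(),
    Hospitalization(invasive_ventilation=True),
    Hospitalization(discharged_to_hospice=True),
]
labels = [had_severe_course(s) for s in stays]
print(labels)  # [False, True, True]
```

A binary label of this kind is what a classifier such as a random forest or XGBoost model would be trained to predict.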
The cohort included 174,568 adults who tested positive for COVID-19 and a control group of 1,133,848 adults who tested negative. Of the adults who tested positive, 18.6 percent were hospitalized, and 20.2 percent of those hospitalized had a severe clinical course.
The mortality rate of hospitalized patients was 11.6 percent overall and decreased from 16.4 percent in March to April 2020 to 8.6 percent in September to October 2020.
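To put these percentages in terms of patient counts, the arithmetic can be sketched in a few lines of Python. Because the reported percentages are rounded, the resulting counts are approximations and may differ slightly from the exact figures in the study.

```python
# Figures reported in the article.
positive_adults = 174_568
hospitalized_share = 0.186            # share of positive adults hospitalized
severe_share_of_hospitalized = 0.202  # share of hospitalized with severe course

# Approximate counts implied by the rounded percentages.
hospitalized = positive_adults * hospitalized_share
severe = hospitalized * severe_share_of_hospitalized
print(round(hospitalized))  # 32470
print(round(severe))        # 6559

# Severe cases as a share of all adults who tested positive.
severe_share_of_positives = hospitalized_share * severe_share_of_hospitalized
print(f"{severe_share_of_positives:.1%}")  # 3.8%

# Change in mortality among hospitalized patients (percent).
early, late = 16.4, 8.6  # March-April 2020 vs. September-October 2020
absolute_drop = early - late
relative_drop = absolute_drop / early
print(f"{absolute_drop:.1f}")  # 7.8 (percentage points)
print(f"{relative_drop:.1%}")  # 47.6%
```

In other words, roughly 32,500 of the positive adults were hospitalized, about 6,600 of them had a severe course, and mortality among hospitalized patients fell by nearly half in relative terms.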
“Using 64 inputs available on the first hospital day, this study predicted a severe clinical course using random forest and XGBoost models (area under the receiver operating curve = 0.87 for both) that were stable over time. The factor most strongly associated with clinical severity was pH; this result was consistent across machine learning methods,” the study explained.
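For readers unfamiliar with the metric, the area under the receiver operating curve can be read as the probability that a model assigns a higher risk score to a randomly chosen severe case than to a randomly chosen non-severe case. The Python sketch below computes it that way on a tiny made-up example. The function name and toy data are illustrative assumptions and are not taken from the study.

```python
def auroc(labels, scores):
    """Fraction of (positive, negative) pairs in which the positive case
    receives the higher score; tied scores count as half a win."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data: 1 = severe course, 0 = not severe; scores are model risk estimates.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(auroc(toy_labels, toy_scores))  # 0.75
```

On this scale, 0.5 corresponds to random guessing and 1.0 to perfect ranking, so the reported value of 0.87 means a severe case outranked a non-severe one in about 87 percent of such pairings.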
The authors concluded that COVID-19 mortality decreased during 2020 and that patient demographic characteristics and comorbidities were tied to higher clinical severity. The machine learning models accurately predicted clinical severity using commonly collected clinical data from the first 24 hours of a patient’s hospitalization.