Radiologist, Machine Learning Combo Enhances Breast Cancer Screening

Combining machine learning algorithms with evaluations from radiologists could improve the accuracy of breast cancer screenings.

Machine learning tools could help enhance the accuracy of breast cancer screenings when combined with assessments from human radiologists, a study published in JAMA Network Open revealed.

Mammography screening is one of the most commonly used tools for early breast cancer detection, researchers noted, and multiple clinical trials have shown that the approach can decrease mortality.

However, the group also stated that mammography screening isn’t always perfectly accurate. Roughly nine to ten percent of the 40 million US women who undergo breast cancer screening each year are recalled for additional diagnostic imaging, and only four to five percent of these women are ultimately diagnosed with breast cancer. This can lead to patient anxiety and unnecessary treatment or interventions.

The team aimed to improve breast cancer screening accuracy by combining evaluations from radiologists with machine learning algorithms. The study was based on results from the Dialogue on Reverse Engineering Assessment and Methods (DREAM) initiative, a crowd-sourced competition that sought to assess whether AI algorithms could meet or beat radiologists' interpretive accuracy.

Conducted by investigators at IBM Research, Sage Bionetworks, Kaiser Permanente Washington Health Research Institute, and the University of Washington School of Medicine, the DREAM initiative brought together a broad, international scientific community and tested multiple algorithms for improved breast cancer detection.

The team trained and validated the algorithms using hundreds of thousands of screening mammograms from women in the US. The algorithms were trained either on images alone or on images combined with previous examinations and clinical and demographic risk factor data, with each case labeled according to whether cancer was diagnosed within 12 months.

The top-performing algorithm achieved an area under the curve of 0.858 and a specificity of 66.2 percent, lower than community-practice radiologists' specificity of 90.5 percent. However, combining the top-performing algorithm with US radiologist assessments yielded a higher area under the curve of 0.942 and a specificity of 92.0 percent.
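The article does not describe the exact rule the researchers used to fuse the algorithm's output with radiologist assessments. As a purely hypothetical illustration of the general idea, one simple fusion strategy is a weighted average of the model's cancer probability and the radiologist's binary recall decision (the `weight` and `threshold` values below are illustrative assumptions, not figures from the study):

```python
def combined_assessment(model_prob, radiologist_recall, weight=0.5, threshold=0.5):
    """Fuse an AI model's predicted cancer probability (0.0-1.0) with a
    radiologist's binary recall decision (1 = recall, 0 = no recall).

    Returns the fused score and whether it crosses the recall threshold.
    The weighting scheme is a sketch, not the method from the DREAM study.
    """
    score = weight * model_prob + (1 - weight) * radiologist_recall
    return score, score >= threshold

# Example: the model is uncertain (0.4), but the radiologist recalls the patient.
score, recall = combined_assessment(0.4, 1)
# score = 0.5 * 0.4 + 0.5 * 1 = 0.7, which exceeds the 0.5 threshold
```

In practice such ensembles are usually tuned on held-out data so that the combined operating point improves specificity without sacrificing sensitivity, which is the pattern the study reports.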

Instead of demonstrating AI’s potential to take over radiologists’ jobs completely, the results support the idea that advanced analytics tools can augment care delivery and diagnostics.

“Based on our findings, adding AI to radiologists’ interpretation could potentially prevent hundreds of thousands of unnecessary diagnostic workups each year in the United States. Robust clinical validation is necessary, however, before any AI algorithm can be adopted broadly,” said Dr. Christoph Lee, professor of radiology at the University of Washington School of Medicine and co-first author of the paper.

As AI continues to seep into clinical settings, more and more healthcare professionals are coming to view these technologies as valuable support tools. A recent survey conducted by MIT Technology Review and GE Healthcare revealed that 79 percent of providers believe AI tools have helped reduce clinician burnout, allowing professionals to deliver more patient-centered, engaging care.

“Healthcare institutions have been anticipating the impact that artificial intelligence (AI) will have on the performance and efficiency of their operations and their workforces—and the quality of patient care,” the survey said.

“Contrary to common, yet unproven, fears that machines will replace human workers, AI technologies in healthcare may actually be ‘re-humanizing’ healthcare, just as the system itself shifts to value-based care models that may favor the outcome patients receive instead of the number of patients seen.”

A separate study from NYU showed that combining AI with analysis from human radiologists significantly improved breast cancer detection, helping radiologists reduce the number of biopsies needed.

“Our study found that AI identified cancer-related patterns in the data that radiologists could not, and vice versa,” said senior study author Krzysztof J. Geras, PhD, assistant professor in the Department of Radiology at NYU Langone.

“AI detected pixel-level changes in tissue invisible to the human eye, while humans used forms of reasoning not available to AI. The ultimate goal of our work is to augment, not replace, human radiologists.”

The study based on the DREAM initiative indicated that combining radiologists and AI algorithms could reduce mammography recall rates from 0.095 to 0.08, an absolute reduction of 1.5 percentage points. This means that of the approximately 40 million US women who are screened for breast cancer each year, more than half a million would not have to undergo unnecessary diagnostic work-up.
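The half-million figure follows directly from the numbers above. A quick back-of-the-envelope check (the 40 million screening total is the article's approximation):

```python
# Figures reported in the article
screened_per_year = 40_000_000    # approximate US women screened annually
recall_rate_radiologist = 0.095   # recall rate without AI assistance
recall_rate_combined = 0.080      # recall rate with radiologist + AI

# Absolute reduction in recall rate, and the work-ups that implies
absolute_reduction = recall_rate_radiologist - recall_rate_combined  # 0.015
avoided_workups = screened_per_year * absolute_reduction

print(f"{absolute_reduction:.1%} absolute reduction -> "
      f"about {avoided_workups:,.0f} avoided diagnostic work-ups per year")
```

Running this yields roughly 600,000 avoided work-ups, consistent with the article's "more than half a million."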

“While no single AI algorithm outperformed US community radiologist benchmarks, an ensemble of AI algorithms combined with single-radiologist assessment was associated with an improved overall mammography performance,” researchers said.

According to the research group, the DREAM challenge represents the largest objective AI benchmarking effort in screening mammography interpretation to date. Investigators are making the algorithms freely available to the larger research community for use and assessment in future studies.

“To our knowledge, this was the first study in AI and mammography benchmarking requiring teams to submit their algorithms to the challenge organizers, which permitted the evaluation of their algorithms in an unbiased and fully reproducible manner,” the team concluded.

“We believe this to be an important new paradigm for data sharing and cloud-based AI algorithm development, allowing highly sensitive and restricted data such as screening mammograms to be used for public research and AI algorithm assessment.”
