Breast Cancer Detection AI Performs On Par with Human Mammogram Readers

A breast cancer screening AI demonstrated similar sensitivity and specificity to those achieved by 552 human readers of screening mammograms.

The diagnostic performance of a commercially available artificial intelligence (AI) algorithm was comparable with that of human readers of mammograms, according to a study published recently in Radiology.

Screening mammograms are routinely used to detect breast cancer, but the researchers explained that this approach has limitations. Mammography does not catch every breast cancer, and it can produce false positives, which can lead to patients without cancer undergoing unnecessary imaging or biopsies.

To help address these challenges, two readers are often tasked with interpreting each mammogram to improve the sensitivity and specificity of screening. The research team further indicated that double reading can increase cancer detection rates by 6 to 15 percent.

However, relying on two readers is labor-intensive, a problem exacerbated by healthcare workforce shortages. The researchers noted that AI has been proposed to tackle this problem, but ensuring that AI tools are safe and effective is critical before they can be widely adopted.

The research team used the Personal Performance in Mammographic Screening (PERFORMS) assessment—a quality assurance framework used by the United Kingdom’s National Health Service Breast Screening Program (NHSBSP) to evaluate the performance of human readers of mammograms—to test one already-available AI.

The study also served to determine whether PERFORMS could effectively assess the performance of an AI algorithm.

Both the AI and 552 readers, made up of 315 board-certified radiologists, 206 radiographers, and 31 breast clinicians, were asked to read mammograms from two PERFORMS test sets, which each consisted of 60 cases with normal, benign, or abnormal findings.

The PERFORMS test sets contained 161 normal breasts, nine benign breasts, and 70 malignant breasts.

The AI and the human readers were assessed based on the sensitivity, specificity, and area under the receiver operating characteristic curve scores associated with their ability to read each mammogram.
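For readers unfamiliar with these metrics, the following is a minimal sketch of how they are conventionally computed, not the study's own analysis code. The function names and all input numbers are hypothetical and chosen only for illustration. Sensitivity is the share of cancers correctly flagged, specificity is the share of non-cancers correctly cleared, and the area under the curve is computed here as the probability that a randomly chosen positive case receives a higher suspicion score than a randomly chosen negative one.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of actual cancers that were flagged as suspicious."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of non-cancers that were correctly cleared."""
    return true_neg / (true_neg + false_pos)


def auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison.

    Counts how often a positive case outscores a negative case;
    tied scores count as half a win.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical counts and scores, not taken from the study.
example_sensitivity = sensitivity(true_pos=9, false_neg=1)
example_specificity = specificity(true_neg=15, false_pos=5)
example_auc = auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])
```

A reader or algorithm that flags more cases will usually gain sensitivity at the cost of specificity, which is why the area under the curve, summarizing performance across all thresholds, is reported alongside the two.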

Overall, the AI and the human readers performed at similarly high levels.

Human readers demonstrated 90 percent sensitivity and 76 percent specificity, compared with the AI’s 91 percent sensitivity and 77 percent specificity.

Despite these promising results that suggest an AI could perform as well as a human for mammography reading, the researchers cautioned that more research is needed before AI could be leveraged as a second reader in clinical settings.

AI performance can ‘drift’ over time, which could lead to performance shifts that may negatively impact patient outcomes, the research team explained. These algorithms can also be affected by changes in their operating environment, leading the researchers to suggest that further work in AI monitoring be undertaken to ensure that AI can be safely deployed in clinical practice.
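One simple way to picture such monitoring is to re-score an algorithm on a fixed test set at regular intervals and flag any round in which a metric falls noticeably below its baseline. The sketch below is an illustrative assumption rather than a method described by the researchers; the function name, the tolerance value, and the history of scores are all hypothetical.

```python
def flag_drift(baseline, history, tolerance=0.05):
    """Return the indices of evaluation rounds whose metric fell
    more than `tolerance` below the baseline value.

    `baseline` and `history` hold the same metric (for example,
    sensitivity) measured on the same fixed test set over time.
    """
    return [
        round_index
        for round_index, value in enumerate(history)
        if baseline - value > tolerance
    ]


# Hypothetical sensitivity readings from four repeat evaluations.
flagged_rounds = flag_drift(baseline=0.91, history=[0.91, 0.90, 0.84, 0.92])
```

In practice, a monitoring program would also need to account for sampling noise and for changes in the imaging equipment or patient population, which a fixed threshold like this does not capture.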

Radiology is one of the medical specialties at the center of much AI hype as researchers continue to explore how these technologies may transform medical imaging.

One potential application was detailed last year by Google Health researchers, who showed that one of the company’s deep learning (DL) tools achieves performance comparable to that of human radiologists when tasked with detecting tuberculosis (TB) via chest radiographs.

TB’s significant global health burden is a major driver of efforts to detect and eliminate the disease. Chest radiograph-guided screening is effective in detecting TB, but a lack of expertise in interpreting chest radiographs limits the utility of the approach in many parts of the world.

The DL tool is designed to combat this by using chest radiographs to identify active pulmonary TB. When tested, the model performed on par with human radiologists, leading the researchers to conclude that the tool could facilitate TB screening in limited-resource areas.
