AI-Based “Denoising” in Medical Imaging May Require Further Evaluation
Researchers are calling for objective evaluations of deep learning-based tools that clean up medical images to ensure that the tools actually improve performance on clinical tasks.
Researchers at Washington University in St. Louis are investigating how deep learning (DL) tools that “denoise,” or clean up, medical images affect the clinical utility of those images.
Conversations around how artificial intelligence (AI) could transform medical imaging have become more common as these technologies have become more advanced in recent years. One of the potential applications of AI in this area is to denoise images, the research team explained.
However, these tools cannot be deployed in the clinical setting without being rigorously tested for clinical utility, which involves evaluating how denoising methods perform on clinically relevant tasks.
To assess one of these tools, the researchers tasked a commonly used DL tool with denoising cardiac single-photon emission computed tomography (SPECT) images. To evaluate the tool’s performance, the research team looked at how visually similar the denoised images were to normal images and how denoising affected the clinically relevant task of detecting a heart defect.
The study revealed that the denoising tool tended to smooth the cardiac SPECT images, which did reduce noise. However, this smoothing also reduced the contrast of the heart defects captured in the images, which help clinicians make accurate diagnoses.
“Rather alarmingly, while the visual-similarity-based metrics suggested that the AI-based denoising technique improved performance, it was actually having no significant impact, and in some cases, it was even degrading performance on clinical tasks,” explained Abhinav Jha, PhD, assistant professor of biomedical engineering at the Washington University McKelvey School of Engineering and of radiology at Mallinckrodt Institute of Radiology (MIR) in the School of Medicine.
“This emphasizes the important need for performing evaluation of AI algorithms on clinical tasks and not just relying on visual similarity as a measure of performance,” he stated.
The researchers pointed out that this kind of side effect, in which denoising also reduces contrast, is what studies like this one aim to keep from reaching the clinical setting.
To this end, the research team recommends that task-based evaluations of AI-based denoising approaches be used to evaluate the clinical utility of the images produced.
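The gap between visual similarity and task performance can be illustrated with a toy simulation. The sketch below is not the study's method: it assumes a one-dimensional "image" with a small dip standing in for a heart defect, uses a moving average as a stand-in for the denoiser, and scores detection with a hypothetical observer that knows where the defect would be. All constants and function names are illustrative assumptions.

```python
import math
import random

N_PIXELS = 64            # length of the 1-D "image" profile (assumed)
DEFECT = range(30, 34)   # hypothetical defect location: 4 pixels
DEFECT_DEPTH = 0.4       # uptake drop inside the defect (assumed)
NOISE_SIGMA = 0.5        # Gaussian noise level (assumed)
WINDOW = 9               # moving-average width, stand-in for a denoiser


def clean_profile(defect_present):
    """Uniform uptake of 1.0, with a dip where the defect sits."""
    profile = [1.0] * N_PIXELS
    if defect_present:
        for i in DEFECT:
            profile[i] -= DEFECT_DEPTH
    return profile


def add_noise(profile, rng):
    """Add independent Gaussian noise to every pixel."""
    return [v + rng.gauss(0.0, NOISE_SIGMA) for v in profile]


def smooth(profile):
    """Moving average; the window is truncated at the edges."""
    half = WINDOW // 2
    out = []
    for i in range(len(profile)):
        window = profile[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out


def rmse(a, b):
    """Root-mean-square error, a simple fidelity (similarity) metric."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def observer_score(profile):
    """Higher score = more evidence of a defect (lower uptake at the site)."""
    return -sum(profile[i] for i in DEFECT) / len(DEFECT)


def auc(scores_present, scores_absent):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney)."""
    wins = sum((p > a) + 0.5 * (p == a)
               for p in scores_present for a in scores_absent)
    return wins / (len(scores_present) * len(scores_absent))


def run_experiment(n_trials=400, seed=0):
    rng = random.Random(seed)
    # scores[name] = (defect-present scores, defect-absent scores)
    scores = {"noisy": ([], []), "smoothed": ([], [])}
    for present in (True, False):
        idx = 0 if present else 1
        for _ in range(n_trials):
            noisy = add_noise(clean_profile(present), rng)
            scores["noisy"][idx].append(observer_score(noisy))
            scores["smoothed"][idx].append(observer_score(smooth(noisy)))
    # Fidelity is measured on one extra defect-present realization.
    truth = clean_profile(True)
    example = add_noise(truth, rng)
    return {
        "rmse_noisy": rmse(example, truth),
        "rmse_smoothed": rmse(smooth(example), truth),
        "auc_noisy": auc(*scores["noisy"]),
        "auc_smoothed": auc(*scores["smoothed"]),
    }


if __name__ == "__main__":
    for name, value in run_experiment().items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions, smoothing lowers the error against the noise-free profile, so a similarity metric reports an improvement, while the same smoothing flattens the dip and the observer's detection AUC falls. A real evaluation would use clinical images and validated observer models; the sketch only shows why the two kinds of metric can disagree.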
“Ensuring AI-based denoising works well for real clinical tasks – not just aesthetically – would mean big benefits for patients by producing high-quality images in less time or with reduced radiation doses,” said study collaborator Robert J. Gropler, MD, professor of radiology and senior vice chair and division director of radiological sciences at MIR.
Moving forward, the researchers are working on developing a new denoising technique that addresses the challenges highlighted in this study.
The shortcomings of the tool showcased in this research underscore the challenges of applying AI to medical imaging.
Last year, experts spoke with HealthITAnalytics about how providers can minimize AI-based image reconstruction risks.
AI-based image reconstruction carries some risk of distortion, which can lead to inaccurate diagnoses. While the overall risk is low, leadership from ECRI, an independent nonprofit focused on improving the quality of care, and Stanford Medicine indicated that providers must consider the problems that could arise when using AI for image reconstruction, such as applying the technologies outside the bounds within which they were developed, and must remain aware of potential biases in the tool.