Detecting Artificial Intelligence Algorithm Bias in Cancer Treatment
Poorly trained artificial intelligence can introduce biases into cancer treatment.
Artificial intelligence and deep learning models have become important and powerful tools in cancer treatment. By analyzing digital images or tumor biopsy samples, AI can assist physicians in diagnosing the type of cancer, predicting a prognosis, and developing a course of treatment.
However, if algorithms are not properly trained, they can be inaccurate and biased.
Researchers from the University of Chicago conducted a study showing that deep learning models trained on large sets of cancer genetic and tissue histology data can quickly identify the institution that submitted the images.
The models that use machine learning methods to “teach” themselves how to recognize specific cancer signatures will use the submitting site as a shortcut to predicting patient outcomes, lumping all patients from the same site together instead of looking at the biology of individual patients.
This algorithmic error could lead to bias and missed opportunities for treatment in patients from racial or ethnic minority groups who may already be underrepresented and struggling with access to care.
“We identified a glaring hole in the current methodology for deep learning model development which makes certain regions and patient populations more susceptible to be included in inaccurate algorithmic predictions,” Alexander Pearson, MD, PhD, assistant professor of Medicine at UChicago Medicine and co-senior author, said in a press release.
Cancer treatment often begins with taking a biopsy or small tissue sample from the patient’s tumor. Next, a small slice of the tumor is affixed to a glass slide, which is then stained with multicolored dyes and reviewed by a pathologist to make a diagnosis. Digital images can then be created for remote analysis and storage by using a scanning microscope.
While most of these steps are standard across all labs, some locations have minor variations such as the color or amount of the stain used, how the tissue is processed, and signatures created by the imaging equipment. Although these location-specific signatures might not be visible to the naked eye, the researchers said powerful deep learning algorithms can detect them.
While these algorithms can be a valuable tool for allowing physicians to quickly examine tumors and create treatment options, this potential bias could mean that tumors are not always being assessed on their biological signatures.
Pearson and colleagues analyzed the performance of deep learning models trained on data from The Cancer Genome Atlas (TCGA). These models can predict survival rates, gene expression patterns, and mutations by studying tissue histology.
However, the frequency of patient characteristics varies widely depending on which institution is submitting the images, and the model often defaults to the “easiest” way to tell samples apart: by the submitting site.
The team of researchers discovered that when the model identifies the institution that submitted the images, it will tend to use that information as a stand-in for other characteristics of the image, including ancestry.
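A rough way to see why the submitting site makes such a tempting shortcut is to check how well an outcome can be guessed from the site label alone. The Python sketch below is illustrative only and is not the researchers' method: the function names and the toy data are hypothetical, and the accuracies are computed on the same samples used to count outcomes, so they overstate what a real model would achieve.

```python
from collections import Counter, defaultdict


def majority_accuracy(outcomes):
    """Accuracy of always guessing the single most common outcome."""
    return Counter(outcomes).most_common(1)[0][1] / len(outcomes)


def site_only_accuracy(sites, outcomes):
    """Accuracy of guessing each site's most common outcome.

    This baseline never looks at an image. If it clearly beats
    majority_accuracy, the site label alone carries information
    about the outcome, which a model could exploit as a shortcut.
    """
    counts_by_site = defaultdict(Counter)
    for site, outcome in zip(sites, outcomes):
        counts_by_site[site][outcome] += 1
    correct = sum(c.most_common(1)[0][1] for c in counts_by_site.values())
    return correct / len(outcomes)


# Hypothetical toy data: two sites with different outcome mixes.
toy_sites = ["A", "A", "A", "A", "B", "B", "B", "B"]
toy_outcomes = [1, 1, 1, 0, 0, 0, 0, 1]

print(majority_accuracy(toy_outcomes))              # 0.5
print(site_only_accuracy(toy_sites, toy_outcomes))  # 0.75
```

In this toy case, knowing only the site lifts accuracy from 0.5 to 0.75, which is the kind of gap that would make a site signature attractive to a model.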
"Algorithms are designed to find a signal to differentiate between images, and it does so lazily by identifying the site," Pearson said. "We actually want to understand what biology within a tumor is more likely to predispose resistance to treatment or early metastatic disease, so we have to disentangle that site-specific digital histology signature from the true biological signal."
According to the researchers, the best way to avoid bias is to carefully consider the data used to train the models.
“Developers can make sure that different disease outcomes are distributed evenly across all sites used in the training data, or by isolating a certain site while training or testing the model when the distribution of outcomes is unequal,” the press release stated.
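The second option, isolating sites while training or testing, could be sketched as a cross-validation split in which no submitting site appears on both sides of a fold. The Python below is a minimal illustration under stated assumptions, not the study's procedure: the function name `site_held_out_folds` is hypothetical, and the greedy assignment balances only fold sizes, not the distribution of outcomes across folds.

```python
from collections import defaultdict


def site_held_out_folds(sites, n_folds):
    """Yield (train_idx, test_idx) pairs with no site shared between them.

    sites   -- one site label per sample (hypothetical input format)
    n_folds -- number of folds; must not exceed the number of distinct sites
    """
    indices_by_site = defaultdict(list)
    for i, site in enumerate(sites):
        indices_by_site[site].append(i)
    if n_folds < 2 or n_folds > len(indices_by_site):
        raise ValueError("n_folds must be between 2 and the number of sites")

    # Assign whole sites to folds, largest site first, always adding
    # to the currently smallest fold so that fold sizes stay similar.
    folds = [[] for _ in range(n_folds)]
    ordered = sorted(indices_by_site, key=lambda s: (-len(indices_by_site[s]), s))
    for site in ordered:
        smallest = min(range(n_folds), key=lambda k: len(folds[k]))
        folds[smallest].extend(indices_by_site[site])

    for k in range(n_folds):
        test_idx = sorted(folds[k])
        train_idx = sorted(i for j in range(n_folds) if j != k for i in folds[j])
        yield train_idx, test_idx


# Hypothetical example: eight slides from four submitting sites.
example_sites = ["A", "A", "A", "B", "B", "C", "C", "D"]
for train_idx, test_idx in site_held_out_folds(example_sites, n_folds=2):
    train_sites = {example_sites[i] for i in train_idx}
    test_sites = {example_sites[i] for i in test_idx}
    print(sorted(train_sites), sorted(test_sites))
```

Because every test fold contains only sites the model never saw during training, a model that relies on a site signature gains nothing from it at evaluation time, so the reported performance reflects the biological signal more closely.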
Training models this way should produce more accurate tools that can quickly give physicians the information they need to make a diagnosis and plan treatment. The more efficient and accurate the models are, the better the care will be for the patient.
“The promise of artificial intelligence is the ability to bring accurate and rapid precision health to more people,” Pearson said. “In order to meet the needs of the disenfranchised members of our society, however, we have to be able to develop algorithms which are competent and make relevant predictions for everyone.”