Deep Learning Technique Could Improve Cancer Diagnostics
A new two-step technique to train deep learning algorithms could enhance researchers’ understanding of cancer diagnostics and biology.
A team from the Lawrence J. Ellison Institute for Transformative Medicine of USC has developed a technique to train a deep learning algorithm for improved cancer diagnostics.
In a study published in Scientific Reports, researchers describe using novel tissue fingerprints of tumors paired with correct diagnoses to facilitate deep learning in the classification of breast cancer.
Developing artificial intelligence algorithms for cancer diagnostics is a challenging task, the team noted. These tools need clinically annotated data from tens of thousands of patients to analyze before they can recognize meaningful relationships in the data with consistency. In cancer pathology, a dataset of that size is nearly impossible to collect. Researchers typically have access to only hundreds or low thousands of pathology slides annotated with correct diagnoses.
To overcome this challenge, the group developed a two-step process of priming the algorithm to identify unique patterns in cancerous tissue before teaching it the correct diagnoses.
“If you train a computer to reproduce what a person knows how to do, it’s never going to get far beyond human performance,” said lead author Rishi Rawat, PhD.
“But if you train it on a task ten times harder than anything a person could do, you give it a chance to go beyond human capability. With tissue fingerprinting, we can train a computer to look through thousands of tumor images and recognize the visual features to identify an individual tumor. Through training, we have essentially evolved a computer eye that’s optimized to look at cancer patterns.”
The first step in the process introduces the idea of tissue fingerprints: distinguishing architectural patterns in a tumor’s tissue that an algorithm can use to discriminate between samples, because no two patients’ tumors are identical.
The results showed that AI algorithms detected these structural differences on pathology slides with greater accuracy and reliability than the human eye, and recognized them without human guidance.
For the study, the researchers took digital pathology images and split them in half, then prompted a deep learning algorithm to pair them back together based on their molecular fingerprints. This demonstrated the algorithm’s ability to group “same” and “different” pathology slides without paired diagnoses, which allowed the team to train the algorithm on large, unannotated datasets – a technique known as self-supervised learning.
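The split-and-rematch task can be illustrated with a small sketch. The Python below is not the researchers’ code: it assumes synthetic grayscale images in place of pathology slides, and it swaps the learned deep network for a hypothetical stand-in, toy_fingerprint, which is just an intensity histogram. All function names are illustrative. What it shows is the shape of the task: no diagnosis labels are used, and the only “answer key” is which two halves came from the same image.

```python
import numpy as np

def split_halves(image):
    """Split an H x W image into its left and right halves."""
    mid = image.shape[1] // 2
    return image[:, :mid], image[:, mid:]

def toy_fingerprint(patch, bins=16):
    """Hypothetical stand-in for a learned embedding: a unit-length intensity histogram."""
    hist, _ = np.histogram(patch, bins=bins, range=(0.0, 1.0))
    vec = hist.astype(float)
    return vec / (np.linalg.norm(vec) + 1e-12)

def match_halves(images):
    """For each left half, return the index of the most similar right half."""
    lefts, rights = zip(*(split_halves(img) for img in images))
    left_vecs = np.stack([toy_fingerprint(p) for p in lefts])
    right_vecs = np.stack([toy_fingerprint(p) for p in rights])
    similarity = left_vecs @ right_vecs.T  # cosine similarity of unit vectors
    return similarity.argmax(axis=1)

def make_synthetic_slides(n, size=64, seed=0):
    """Assumption: each synthetic 'slide' has its own characteristic brightness."""
    rng = np.random.default_rng(seed)
    slides = []
    for i in range(n):
        center = (i + 0.5) / n
        slides.append(np.clip(rng.normal(center, 0.02, size=(size, size)), 0.0, 1.0))
    return slides

slides = make_synthetic_slides(8)
predicted = match_halves(slides)
# The supervision signal is slide identity, which comes for free with the data.
accuracy = float(np.mean(predicted == np.arange(len(slides))))
print(f"halves re-paired correctly: {accuracy:.0%}")
```

In a real self-supervised setup, the fingerprint function would be a neural network trained so that halves of the same slide score as more similar than halves of different slides. The histogram here only makes the matching step concrete.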
After training the algorithm to identify breast cancer tissue structure that distinguishes patients, researchers implemented the second step for the deep learning tool: learning which of those known patterns correlated with a particular diagnosis.
The discovery training set of 939 cases obtained from the Cancer Genome Atlas enabled the algorithm to accurately assign diagnostic categories of breast cancer to images.
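The second step, fitting a diagnostic classifier on top of features that were learned without labels, can be sketched in the same spirit. The example below is a minimal illustration under stated assumptions, not the method used in the study: the “fingerprint” vectors are synthetic, the two classes are hypothetical, and a nearest-centroid rule stands in for whatever classifier the researchers trained. The point it illustrates is that once the features are fixed, only a handful of labeled cases are needed to fit the final classifier.

```python
import numpy as np

def fit_centroids(features, labels):
    """Compute one mean feature vector per diagnostic class."""
    classes = np.unique(labels)
    centroids = np.stack([features[labels == c].mean(axis=0) for c in classes])
    return classes, centroids

def predict(features, classes, centroids):
    """Assign each sample to the class with the nearest centroid."""
    distances = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=2)
    return classes[distances.argmin(axis=1)]

# Assumption: synthetic 8-dimensional "fingerprints" for two hypothetical classes.
rng = np.random.default_rng(0)
n_per_class, dim = 20, 8
class_means = [np.zeros(dim), np.full(dim, 3.0)]
features = np.concatenate(
    [rng.normal(mean, 0.5, size=(n_per_class, dim)) for mean in class_means]
)
labels = np.repeat(np.array([0, 1]), n_per_class)

# Only five labeled cases per class are used for training; the rest are held out.
train_idx = np.array([0, 1, 2, 3, 4, 20, 21, 22, 23, 24])
test_idx = np.setdiff1d(np.arange(len(labels)), train_idx)

classes, centroids = fit_centroids(features[train_idx], labels[train_idx])
predictions = predict(features[test_idx], classes, centroids)
test_accuracy = float(np.mean(predictions == labels[test_idx]))
print(f"held-out accuracy: {test_accuracy:.0%}")
```

The synthetic classes here are deliberately well separated, so the accuracy says nothing about real pathology data. The sketch only shows how scarce annotations can be spent on the final classification step rather than on feature learning.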
The technique presents a new paradigm for medical machine learning, and could allow future deep learning algorithms to process unannotated or unlabeled tissue specimens, as well as variably processed tissue samples, to aid pathologists in cancer diagnosis.
“With clinically annotated pathology data in short supply, we must use it wisely when building classifiers,” said corresponding author Dan Ruderman, PhD, director of analytics and machine learning at the Ellison Institute.
“Our work leveraged abundant unannotated data to find a reduced set of tumor features that can represent unique biology. Building classifiers upon the biology that these features represent enables us to efficiently focus the precious annotated data on clinical aspects.”