Machine Learning Tool Knows when to Defer to Human Clinicians

A machine learning system can either make a prediction about a task, or defer the decision to a human expert.

Jessica Kent

Published: 04 Aug 2020

Researchers from MIT’s Computer Science and Artificial Intelligence Lab (CSAIL) have developed a machine learning tool that can adapt when and how often it defers to human experts based on factors such as the expert’s availability and level of experience.

In healthcare, the use of artificial intelligence and machine learning has become more widespread, but experts have continuously cautioned against the use of advanced, autonomous tools in patient care. The industry has taken a hybrid approach to AI, developing tools that can work in tandem with human physicians and help make more informed clinical decisions.

The challenging part of these hybrid approaches is understanding when to rely on the expertise of people versus programs. This isn’t always a question of who does a task better, because if a person has limited bandwidth, the system may have to be trained to minimize how often it asks for help.

CSAIL researchers trained a machine learning system on multiple tasks, including looking at chest X-rays to diagnose specific conditions such as lung collapse and an enlarged heart. In the case of an enlarged heart, the team found that their human-AI hybrid model performed eight percent better than either could on their own.

“In medical environments where doctors don’t have many extra cycles, it’s not the best use of their time to have them look at every single data point from a given patient’s file,” said PhD student Hussein Mozannar, lead author of the study with David Sontag, the Von Helmholtz Associate Professor of Medical Engineering in the Department of Electrical Engineering and Computer Science.

“In that sort of scenario, it’s important for the system to be especially sensitive to their time and only ask for their help when absolutely necessary.”

The system has two parts: a classifier that can predict a certain subset of tasks, and a rejector that decides whether a given task should be handled by the system’s own classifier or by the human expert. Through experiments on tasks in medical diagnosis and text and image classification, the team showed that their approach not only achieves better accuracy than baselines, but does so with a lower computational cost and far fewer training data samples.
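As a rough illustration of that two-part design, the sketch below pairs a classifier with a rejector that routes each case to either the model or an expert. It is not the researchers' implementation: the rejector here is a simple confidence threshold standing in for the learned rejector described in the study, and every name (`DeferralSystem`, `toy_classifier`, `toy_expert`, `threshold`) is hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple

# Hypothetical names throughout; this is an illustrative sketch, not the
# CSAIL system. The rejector is a plain confidence threshold, used here
# as a stand-in for a learned rejector.


@dataclass
class DeferralSystem:
    classifier: Callable[[float], Sequence[float]]  # returns class probabilities
    expert: Callable[[float], int]                  # returns the expert's label
    threshold: float = 0.8                          # assumed confidence cutoff

    def rejector(self, probs: Sequence[float]) -> bool:
        """Return True when the case should be deferred to the expert."""
        return max(probs) < self.threshold

    def predict(self, x: float) -> Tuple[int, str]:
        """Return (label, who_decided) for a single input."""
        probs = self.classifier(x)
        if self.rejector(probs):
            return self.expert(x), "expert"
        best = max(range(len(probs)), key=lambda i: probs[i])
        return best, "model"


def toy_classifier(x: float) -> Sequence[float]:
    """Toy binary classifier: treats x in [0, 1] as P(class 1)."""
    p = min(max(x, 0.0), 1.0)
    return [1.0 - p, p]


def toy_expert(x: float) -> int:
    """Toy expert that labels by a fixed cutoff."""
    return 1 if x >= 0.5 else 0


system = DeferralSystem(toy_classifier, toy_expert, threshold=0.8)
confident_case = system.predict(0.95)   # model is confident, keeps the case
borderline_case = system.predict(0.55)  # model is unsure, defers to expert
```

Raising `threshold` in this sketch sends more cases to the expert, while lowering it keeps more of them with the model.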

“Our algorithms allow you to optimize for whatever choice you want, whether that’s the specific prediction accuracy or the cost of the expert’s time and effort,” said Sontag.

“Moreover, by interpreting the learned rejector, the system provides insights into how experts make decisions, and in which settings AI may be more appropriate, or vice-versa.”
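The trade-off between accuracy and the expert's time can be made concrete with a small calculation. The sketch below assumes a simple cost model in which every error costs 1 and every deferral costs a fixed `defer_cost`, then picks the deferral threshold that minimizes the average cost on a handful of made-up validation cases. The cost model, the function names, and the data are all assumptions for illustration and are not taken from the study.

```python
from typing import Sequence

# Illustrative assumption: an error costs 1.0 and each deferral costs
# `defer_cost`. Neither the cost model nor the data comes from the study.


def system_cost(
    confidences: Sequence[float],
    model_correct: Sequence[bool],
    expert_correct: Sequence[bool],
    threshold: float,
    defer_cost: float,
) -> float:
    """Average cost when cases with confidence below `threshold` are deferred."""
    total = 0.0
    for conf, m_ok, e_ok in zip(confidences, model_correct, expert_correct):
        if conf < threshold:
            # Deferred: pay for the expert's time, plus 1 if the expert errs.
            total += defer_cost + (0.0 if e_ok else 1.0)
        else:
            # Kept by the model: pay 1 only if the model errs.
            total += 0.0 if m_ok else 1.0
    return total / len(confidences)


def best_threshold(
    confidences: Sequence[float],
    model_correct: Sequence[bool],
    expert_correct: Sequence[bool],
    defer_cost: float,
    candidates: Sequence[float],
) -> float:
    """Pick the candidate threshold with the lowest average cost."""
    return min(
        candidates,
        key=lambda t: system_cost(
            confidences, model_correct, expert_correct, t, defer_cost
        ),
    )


# Made-up validation cases: the model is wrong on its two least confident ones.
confidences = [0.95, 0.9, 0.6, 0.55]
model_correct = [True, True, False, False]
expert_correct = [True, True, True, True]
candidates = [0.0, 0.7, 1.01]  # never defer, defer unsure cases, always defer

cheap_expert = best_threshold(confidences, model_correct, expert_correct, 0.1, candidates)
costly_expert = best_threshold(confidences, model_correct, expert_correct, 2.0, candidates)
```

With these toy numbers, a cheap expert makes deferring the low-confidence cases worthwhile, while a very costly expert pushes the best threshold toward never deferring.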

The team has yet to test the system with human experts, and instead developed a series of synthetic experts so that they could tweak parameters such as experience and availability. In order to work with a new expert it’s never seen before, the system would need some minimal onboarding to get trained on the person’s particular strengths and weaknesses.
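One way to picture a synthetic expert with adjustable parameters is sketched below. It is only an assumption about what such a simulation might look like: `accuracy` stands in for experience, `availability` is the probability that the expert can take a case, and the class name and fields are hypothetical rather than drawn from the researchers' experiments.

```python
import random
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical simulation for illustration only; the study's synthetic
# experts may be constructed quite differently.


@dataclass
class SyntheticExpert:
    accuracy: float       # stand-in for experience: P(correct label)
    availability: float   # P(the expert is free to take a case)
    num_classes: int = 2
    rng: random.Random = field(default_factory=lambda: random.Random(0))

    def label(self, true_label: int) -> Optional[int]:
        """Return the expert's label, or None if the expert is unavailable."""
        if self.rng.random() >= self.availability:
            return None
        if self.rng.random() < self.accuracy:
            return true_label
        # Otherwise pick uniformly among the incorrect classes.
        wrong = [c for c in range(self.num_classes) if c != true_label]
        return self.rng.choice(wrong)


veteran = SyntheticExpert(accuracy=1.0, availability=1.0)
busy = SyntheticExpert(accuracy=1.0, availability=0.0)
novice = SyntheticExpert(accuracy=0.0, availability=1.0)

veteran_labels = [veteran.label(1) for _ in range(5)]
busy_labels = [busy.label(1) for _ in range(5)]
novice_labels = [novice.label(1) for _ in range(5)]
```

Sweeping `accuracy` and `availability` between these extremes would let a deferral system be evaluated against experts of varying skill and bandwidth without involving real clinicians.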

Going forward, the team plans to test their approach with real human experts, such as radiologists for X-ray diagnosis. Researchers will also explore how to develop systems that can learn from biased expert data, as well as systems that can work with several experts at once. For example, in a hospital setting the machine learning tool could work with different radiologists who are more experienced with different patient populations.

“There are many obstacles that understandably prohibit full automation in clinical settings, including issues of trust and accountability,” said Sontag, who is also a member of MIT’s Institute for Medical Engineering and Science. “We hope that our method will inspire machine learning practitioners to get more creative in integrating real-time human expertise into their algorithms.”
