
Generative AI may bolster digital healthcare software development

New research suggests that generative artificial intelligence could speed the development of health software by improving coding quality and efficiency.

A research team from NYU Langone Health demonstrated that the use of generative artificial intelligence (AI) could accelerate the development of digital health software, according to a study published in JMIR Human Factors.

The researchers noted that the digital transformation of healthcare in recent years has resulted in increasing reliance on software engineering for medical use cases. However, guidance for health researchers detailing how to design digital health interventions is lacking.

Technologies like generative AI have the potential to streamline these efforts by improving the coding process. In the study, the research team explored how applying ChatGPT would affect the development of diabetes prevention software.

The software is designed to use text messages to bolster patient engagement and encourage behavior change, such as exercising and eating healthier.

Using ChatGPT, the researchers aimed to recreate the personalized automatic messaging system (PAMS) integrated into the software.

They began by evaluating ChatGPT’s capacity and limitations in the context of “digital product idea conceptualization, intervention content development, and the software engineering process, including software requirement generation, software design, and code production.”

From there, 11 evaluators with expertise in medicine, computer science, and other relevant fields were tasked with using ChatGPT to produce a version of the diabetes tool. The evaluators rated each of the large language model’s outputs on understandability, usability, relevance, completeness, efficiency, and novelty.

ChatGPT received positive scores across most metrics and significantly accelerated the software development process. The evaluators successfully built a version of the diabetes tool using ChatGPT in approximately 40 hours, whereas the software originally took over 200 programmer hours to develop.

“We found that ChatGPT improves communications between technical and nontechnical team members to hasten the design of computational solutions to medical problems,” said study corresponding author Danissa Rodriguez, PhD, MS, assistant professor in the Department of Population Health at NYU Langone and a member of its Healthcare Innovation Bridging Research, Informatics, and Design (HiBRID) Lab, in a news release. “The chatbot drove rapid progress throughout the software development life cycle, from capturing original ideas, to deciding which features to include, to generating the computer code. If this proves to be effective at scale it could revolutionize healthcare software design.”

The research team further underscored that clinicians and nurses already possess the knowledge needed to write effective large language model prompts, which could ease communication between healthcare providers and the software engineers building a digital health intervention.

“Our study found that ChatGPT can democratize the design of healthcare software by enabling doctors and nurses to drive its creation,” stated senior study author Devin Mann, MD, director of the HiBRID Lab and strategic director of digital health innovation within NYU Langone’s Medical Center Information Technology (MCIT). “GenAI-assisted development promises to deliver computational tools that are usable, reliable, and in line with the highest coding standards.”

This research is one of a growing number of studies assessing how natural language processing (NLP)-based technologies – like large language models – can advance healthcare.

In February, a research team from Pennsylvania State University (PSU) detailed how an NLP framework could significantly enhance the reliability and efficiency of AI-driven medical text summarization tools.

Medical summarization is crucial for generating concise records of clinician-patient interactions, which are used in electronic health records (EHRs), at the point of care and in insurance claims. AI tools can help create these summaries, but concerns about their ability to generate ‘unfaithful’ outputs present a stumbling block for widespread adoption.

To address this, the researchers developed the Faithfulness for Medical Summarization (FaMeSumm) framework, which is designed to help fine-tune existing medical summarization tools by using sets of contrastive summaries and annotated medical terms to address common errors.
