Collecting Genomic Data to Accelerate COVID-19 Drug Discovery

Mount Sinai researchers have developed the COVID-19 Gene Set and Drug Library to leverage genomic data and facilitate collaborative drug discovery.

In the race to find potential treatments and therapies for COVID-19, genomic data is being generated at an unprecedented rate.

The research community is aiming to understand the virus as quickly as possible, resulting in enormous amounts of information – and confusion about what data to prioritize.

“In the search for drugs to treat COVID-19 infection, there is a lot of noise. Hundreds of experimental and computational labs are working on finding and developing drugs that can be used to treat the infection. The results from such efforts are published at a fast rate,” Avi Ma’ayan, Director of Mount Sinai’s Center for Bioinformatics, told HealthITAnalytics.

Researchers within Mount Sinai’s LymeMIND research consortium are seeking to improve the COVID-19 drug discovery process. Ma’ayan and his team recently examined tens of thousands of pieces of published research on the virus, including information about its life cycle, molecular mechanisms, and potential therapeutic drug compounds.

The group used the data to develop the COVID-19 Gene Set and Drug Library, which presents drug and gene sets related to COVID-19 in a user-friendly web interface. The database allows researchers to view, download, and analyze virus information, as well as contribute datasets of their own.

With the database, Mount Sinai researchers expect to identify community consensus and make scientists and providers aware of promising new therapies as they become available. The project will also help manage the large volume of data being produced around COVID-19.

“It is hard to tell which efforts are most promising and which candidate drugs and targets should receive the most attention. The project attempts to organize information from these efforts in an unbiased and reusable manner,” Ma’ayan said.

The project also aims to facilitate partnerships in the research community, enabling teams to work together to find viable therapies for COVID-19 – an essential part of the drug discovery process, Ma’ayan noted.

“Collaborations are critical. Specifically, collaborations between computational and experimental biologists. Computational biologists can best analyze and synthesize all the accumulated data that is collected from cells, animal models, and patients,” he said. 

“These analyses can inform virologists who have the capacity to test drugs and targets in-vitro and in animal models as potential treatments for COVID-19.”

Researchers across the country have embarked on similar endeavors. Recently, a team from the UC Santa Cruz (UCSC) Genomics Institute received seed funding for its work on the UCSC Genome Browser for COVID-19.

The browser is an interactive online tool that allows researchers to easily access the latest molecular data related to the coronavirus genome. Teams can also use the browser to see which parts of the viral genome other investigators are studying, with annotations providing information about published data on crucial aspects of the virus.

“Researchers are generating molecular data pertaining to the SARS-CoV-2 genome and its proteins at an unprecedented rate during the COVID-19 pandemic,” said Max Haeussler, an assistant research scientist at the UCSC Genomics Institute. “As a result, there is a critical need for rapid and continuously updated access to the latest molecular data in a format in which all data can be quickly cross-referenced and compared.”

Through the COVID-19 Gene Set and Drug Library project, the Mount Sinai team has learned what works and what doesn’t in health crisis response efforts, and what could be improved if a similar situation arises in the future.

“The response of the scientific research community to the COVID-19 crisis had a lot of energy. Many scientists still work feverishly towards finding treatments and better understanding the virus life cycle,” Ma’ayan said.

“However, most efforts were not well coordinated. Coordinated efforts would have made this response more efficient.”

Ma’ayan also acknowledged the critical role social media played in expanding researchers’ understanding of the virus. The database includes a tab with information collected from Twitter, showing the top drugs tweeted with the term COVID-19 on specific days during the pandemic.

“The other lesson was the importance of social media. This was critical to communicate ideas, results, and data,” he said.

Social media has been a helpful tool in the healthcare industry’s battle against coronavirus. A team from Penn Medicine recently demonstrated how public health officials could use natural language processing techniques to track surges in interest in COVID-19 topics on online forums like Reddit.

Having access to these insights could help inform leaders’ understanding of public health concerns and priorities, and reduce the spread of misinformation.

Going forward, the approach developed by the Mount Sinai team can be used to improve drug discovery and collaboration for other conditions, Ma’ayan said.

“The approach can be applied to other diseases. The web-resource can be repurposed to other projects where related gene and drugs sets need to be aggregated and compared,” he concluded.
