Critical Considerations Required for COVID-19 Data Analysis

Decision-makers should consider key criteria when analyzing COVID-19 data, including representativeness and time range.

When evaluating COVID-19 data to inform policies, decision-makers should consider five criteria to better understand the spread of the virus in their communities: representativeness, potential systematic over- or under-estimation, uncertainty, time range, and geographical area, according to a report by the National Academies of Sciences, Engineering, and Medicine.

While more data is now available about how COVID-19 is impacting the country, this information often comes in various forms and is not always complete, the National Academies’ newly formed Societal Expert Action Network stated in the report.

Having an enhanced understanding of this data can help inform decisions on critical issues dependent on those indicators, such as lifting social distancing measures and reopening businesses.

“Our intent is not to discourage decision-makers from using any of these data, as they represent the best of what is available,” said Mary Bassett, co-chair of SEAN’s executive committee and director of the François-Xavier Bagnoud Center for Health and Human Rights at Harvard University.

“Rather, the goal of our rapid expert consultation is to clarify the limitations of these data points and help leaders as they make decisions, such as when to allow public gatherings or reopen businesses.”

The team said that several types of data on the extent and spread of COVID-19 are being used to inform decision-making, including number of confirmed cases, hospitalizations, emergency department visits, reported confirmed COVID-19 deaths, excess deaths, fraction of viral tests that are positive, and representative prevalence surveys.

The utility of this data for decision-making depends on multiple factors, the researchers noted, such as the burden of collecting, cleaning, and interpreting the data across sources. Additionally, data models and collection methods tend to improve over time, so any assessment of the data will also need to be updated regularly.

Decision-makers should consider the representativeness of the data, the team said, and determine whether the reporting population represents the population of interest. Policymakers should also consider whether each person in the population has an equal chance of being measured.

Leaders should also weigh the uncertainty in COVID-19 data, the team said. If sample sizes are small, if subjects have been tested twice, or if tests produce inaccurate results, the data may not be as reliable as it should be.
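To illustrate how sample size affects uncertainty, the sketch below computes an approximate 95% Wilson score interval for a positivity rate. The function name and the test counts are hypothetical and are not taken from the report; the Wilson interval is one common choice for proportions, not a method the report prescribes.

```python
import math


def wilson_interval(positives, n, z=1.96):
    """Approximate Wilson score interval for a proportion.

    positives: number of positive results (hypothetical input)
    n: total number of results
    z: normal quantile; 1.96 corresponds to roughly 95% coverage
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = positives / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Same observed rate (10%), very different sample sizes (made-up numbers).
small_sample = wilson_interval(5, 50)
large_sample = wilson_interval(500, 5000)
print("n=50:   %.3f to %.3f" % small_sample)
print("n=5000: %.3f to %.3f" % large_sample)
```

With the same observed rate, the interval from 50 tests is far wider than the one from 5,000. Note that this captures sampling uncertainty only; it does not account for duplicate testing or inaccurate tests.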

In addition, lawmakers should be aware that there is often a time lag between the occurrence of an indicator and its reporting.

“Data tend to become more complete over time, so that counts must generally be revised (e.g., deaths on weekends are often reported on the next working day). A second problem is that data on deaths, for example, reflect infections that occurred some time ago and thus need to be interpreted in that context,” the team stated.

Policymakers should consider over- or under-estimation of particular data elements as well.

“Given two indicators, one of which may result from systematic overestimation and the other from systematic underestimation, it is good to use both to guide a decision,” the group said.

“For example, the proportion of positive tests in a sample of people with active symptoms will be an overestimate of the true prevalence of disease in the population, while the number of confirmed cases as a proportion of the population will likely be an underestimate.”
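The group's example can be sketched with a few lines of arithmetic. All of the numbers and names below are invented for illustration; the sketch simply treats the two indicators as a rough lower and upper bracket on true prevalence, under the assumption stated in the quote that one underestimates and the other overestimates.

```python
def prevalence_bracket(confirmed_cases, population, positive_tests, symptomatic_tested):
    """Return (likely_underestimate, likely_overestimate) of prevalence.

    Assumes, as in the report's example, that confirmed cases per capita
    undercount true prevalence and that positivity among symptomatic
    people overstates it. This is a rough bracket, not a formal bound.
    """
    if population <= 0 or symptomatic_tested <= 0:
        raise ValueError("denominators must be positive")
    low = confirmed_cases / population
    high = positive_tests / symptomatic_tested
    return low, high


# Hypothetical community: 100,000 residents, 800 confirmed cases,
# and 300 positives among 2,000 symptomatic people tested.
low, high = prevalence_bracket(800, 100_000, 300, 2_000)
print(f"Prevalence is plausibly between {low:.1%} and {high:.1%}")
```

The wide gap between the two figures is the point: neither indicator alone pins down prevalence, but together they indicate a range within which the true value likely falls.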

Going forward, these considerations will be critical as the country moves into subsequent phases of reopening and of the pandemic.

“This is our network’s first official response, and we’re well-positioned to address other questions from governors, mayors, city councils, and other leaders grappling with how to respond to the outbreak,” said SEAN executive committee co-chair Robert Groves, executive vice president and provost at Georgetown University.

“Social science is uniquely poised to help weigh risks, understand the causes and consequences of people’s behavior, and guide informed decisions forced to be taken under uncertainty.”
