Race “Unknown”: Missing Information Skews COVID-19 Statistics

Posted on Tuesday, March 9, 2021

A year into the COVID-19 pandemic, patient race/ethnicity is still largely unaccounted for in the data of many states and cities across the nation.

ARCS Scholar Katie Labgold says Atlanta is no different. Labgold, an Emory University PhD student in epidemiology, is currently immersed in a study that proposes a new method for addressing the challenge of missing race/ethnicity information in COVID-19 case reports. With more complete data, researchers will be able to better understand who is most affected by COVID-19, and why.

Labgold and a team of colleagues partnered with Atlanta’s Fulton County Board of Health to analyze the completeness of COVID-19 data related to hospitalization and fatality rates in the county. Labgold says that when they started their research, about one-third of the reports did not include race or ethnicity information.

If a patient report does not include racial or ethnic data, the form is excluded from the county’s overall statistical research for COVID-19, according to Labgold. Her team’s question: Do these excluded reports bias what researchers understand about the pandemic’s burden on certain communities? To find the answer, the Emory group used an epidemiologic method known as quantitative bias analysis to estimate COVID-19 notification, hospitalization, and fatality rates by race/ethnicity while accounting for the missing data.

Quantitative bias analysis allows scientists to estimate the effect of nonrandom (systematic) errors in epidemiologic studies, such as uncontrolled confounding, selection bias, or information bias, including measurement error or misclassification.

“We used the information of a patient’s surname and residence to predict the person’s race or ethnicity. We then corrected some of the data because we knew the prediction would possibly misclassify people into the wrong race or ethnic categories,” Labgold explains.
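The correction step Labgold describes can be illustrated with a minimal sketch. It is not the study's actual procedure: it assumes analysts have, for each predicted group, an estimate of how often that prediction matches a person's true group (a predictive value, for example estimated from cases whose race/ethnicity was actually reported). The group labels "A" and "B", the counts, the probabilities, and the function name `correct_predicted_counts` are all hypothetical.

```python
# Hypothetical illustration only: the groups, counts, and probabilities
# below are invented and are not taken from the Fulton County study.

def correct_predicted_counts(predicted_counts, predictive_values):
    """Reallocate predicted counts using P(true group | predicted group).

    predicted_counts:  {predicted_group: number of cases}
    predictive_values: {predicted_group: {true_group: probability}}
    Returns the expected number of cases in each true group.
    """
    corrected = {}
    for predicted_group, n_cases in predicted_counts.items():
        probs = predictive_values[predicted_group]
        # Each row of probabilities must describe a full distribution.
        if abs(sum(probs.values()) - 1.0) > 1e-9:
            raise ValueError(f"probabilities for {predicted_group!r} must sum to 1")
        for true_group, p in probs.items():
            corrected[true_group] = corrected.get(true_group, 0.0) + n_cases * p
    return corrected


# Cases with missing race/ethnicity, grouped by the predicted category.
predicted_counts = {"A": 600, "B": 400}

# Assumed chance that a case predicted as one group truly belongs to each group.
predictive_values = {
    "A": {"A": 0.9, "B": 0.1},
    "B": {"A": 0.2, "B": 0.8},
}

corrected = correct_predicted_counts(predicted_counts, predictive_values)
print(corrected)
```

The total number of cases is unchanged by the correction; only their allocation across groups shifts. In a fuller bias analysis the predictive values would themselves be treated as uncertain and varied across plausible ranges rather than fixed.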

The results were shocking. When cases with missing race/ethnicity information were included in the analysis using the predicted values, the racial/ethnic disparity (specifically, the difference in COVID-19 notification rates for Black, Hispanic, and other racial/ethnic groups compared with White cases) was 30 percent to 60 percent greater than the disparity observed when cases missing race/ethnicity data were excluded. Labgold says the analysis not only confirmed the presence of disparities but also revealed that the disparities were greater once the missing race/ethnicity information was accounted for.
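The kind of comparison behind that figure can be sketched as follows: compute the difference in notification rates between a group and the reference group twice, once using only cases with reported race/ethnicity and once after the cases with missing information have been allocated, then express the change as a percentage. Every number below is invented purely for illustration, and the function names are hypothetical; none of it reproduces the study's data or results.

```python
# Hypothetical illustration only: populations and case counts are invented.

def rate_per_100k(cases, population):
    """Notification rate per 100,000 residents."""
    return cases / population * 100_000


def rate_difference(cases, populations, group, reference):
    """Rate in `group` minus rate in `reference`, per 100,000."""
    return (rate_per_100k(cases[group], populations[group])
            - rate_per_100k(cases[reference], populations[reference]))


populations = {"group": 100_000, "reference": 100_000}

# Counts using only reports that included race/ethnicity.
complete_cases = {"group": 1_000, "reference": 500}

# Counts after cases with missing race/ethnicity have been allocated.
all_cases = {"group": 1_400, "reference": 700}

diff_complete = rate_difference(complete_cases, populations, "group", "reference")
diff_all = rate_difference(all_cases, populations, "group", "reference")

# Percent by which the disparity grows once missing cases are accounted for.
percent_greater = (diff_all - diff_complete) / diff_complete * 100
print(round(diff_complete), round(diff_all), round(percent_greater))
```

With these made-up counts, the rate difference grows when the previously excluded cases are counted, which is the direction of change the article reports.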

“Ultimately, this study further motivates the need for accurate and complete data collection of race and ethnicity,” Labgold states. “We need to support labs and public health departments in collecting this information at the time of testing.”

This support will require investing in upgraded data systems and improved training for staff, Labgold says. It will also necessitate increasing the number of public health workers at collection sites to analyze data, identify gaps in racial/ethnic details, and then quickly contact labs and clinics to gather the incomplete information, she adds.

This type of epidemiologic research is not new to Labgold. As a doctoral fellow at the Center for Reproductive Health Research in the Southeast (RISE) at Emory’s Rollins School of Public Health, she is using similar methods to improve the surveillance of maternal and reproductive health outcomes. These studies could potentially find ways to incorporate historical context into epidemiologic research and identify sociopolitical determinants of health disparities for reproductive health outcomes.


Katie Labgold, ARCS Foundation


