When Myles Ingram decided to merge his education in biophysics and data science with his work as a research analyst through Healthcare Innovation Research and Evaluation (HIRE) at Columbia University Irving Medical Center, he activated a transdisciplinary approach that allows him to collect different pieces of information, create a model, quickly test hypotheses related to current events with publicly available data, and build a cohesive narrative.

Ingram, who is a part-time student in the M.S. in Data Science program, recently collaborated with Herbert and Florence Irving Professor of Medicine and Epidemiology and HIRE director Chin Hur and psychology researcher Ashley Zahabian to develop a model that predicted COVID-19 social distancing adherence at the state and county level based on a range of socioeconomic and demographic inputs. Their paper was published in the March 2021 issue of Nature Humanities and Social Sciences Communications; they also won second place in the Data Science Institute’s Center for Health Analytics 2020 COVID-19 Data Challenge.

The idea for the project came to Ingram when he heard a news report claiming that young people do not social distance. He decided to see if there was a way to leverage data science to understand and predict social distancing behaviors, and to determine which demographic features correlate with adherence.

The team gathered and analyzed data from a variety of sources, including cell phone location data collected by Unacast, a company providing human mobility insights, as well as census data, voting data, and health data from the Centers for Disease Control and Prevention. They combined statistics from more than 3,000 counties across the U.S. and examined 45 different factors such as age, political orientation, location, race, education, employment rate, housing density, income levels, obesity, and diabetes.

Based on their analysis, Ingram and his colleagues identified key features that could predict adherence to social distancing with a high level of confidence. In fact, the team’s model predicted county-level social distancing adherence rates with 90.8% accuracy. Their research found that people who work from home were most likely to social distance, while those in dense housing were less likely to do so. Higher per capita income, an older population, and more suburban areas were positively associated with adherence, while a larger African American population, a higher obesity rate, an earlier first COVID-19 case or death, and more Republican-leaning residents were negatively correlated with adherence. This type of prediction model may be used to inform health policy planning and potential interventions in areas with lower adherence.
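As a rough illustration of this kind of analysis, the Python sketch below joins two hypothetical county-level tables on a shared county code and computes the Pearson correlation between each feature and an adherence score. Every value, field name, and county code here is invented for illustration; this is not the team's data, feature set, or model.

```python
from math import sqrt

# Hypothetical county-level records keyed by a made-up county code.
# None of these numbers come from the study.
mobility = {
    "00001": {"adherence": 0.62},
    "00002": {"adherence": 0.48},
    "00003": {"adherence": 0.55},
    "00004": {"adherence": 0.40},
    "00005": {"adherence": 0.70},
}
census = {
    "00001": {"work_from_home_rate": 0.30, "housing_density": 1.2},
    "00002": {"work_from_home_rate": 0.18, "housing_density": 2.5},
    "00003": {"work_from_home_rate": 0.25, "housing_density": 1.8},
    "00004": {"work_from_home_rate": 0.12, "housing_density": 3.1},
    "00005": {"work_from_home_rate": 0.35, "housing_density": 0.9},
    # This county has no mobility record, so the join drops it.
    "00006": {"work_from_home_rate": 0.22, "housing_density": 2.0},
}


def join_on_county(left, right):
    """Inner-join two dicts of records on the county codes they share."""
    shared = sorted(left.keys() & right.keys())
    return {code: {**left[code], **right[code]} for code in shared}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def feature_correlations(rows, target="adherence"):
    """Correlate every non-target field with the target across counties."""
    records = list(rows.values())
    ys = [r[target] for r in records]
    features = [name for name in records[0] if name != target]
    return {name: pearson([r[name] for r in records], ys) for name in features}


merged = join_on_county(mobility, census)
correlations = feature_correlations(merged)
for name, r in sorted(correlations.items()):
    print(f"{name}: r = {r:+.2f}")
```

The invented numbers are arranged to mirror the direction of two reported findings (working from home positive, dense housing negative). A correlation table like this only describes pairwise associations; it is not a predictive model and says nothing about accuracy.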

While many of the findings served to substantiate observational assumptions, some, like the negative correlation between obesity and social distancing, were a surprise. “I initially thought that people with high obesity would be more likely to social distance, given the elevated health risks, but instead we found evidence of the opposite,” Ingram said.

Originally from Norcross, Georgia, Ingram completed his undergraduate studies at Harvard University in biophysics, an emerging field that applies the theories and methods of physics to understand how bodies and biological systems work. He joined HIRE as a research analyst in 2018 and decided to pursue a master’s degree in data science to expand his skill set and advance his laboratory work.

HIRE uses cutting-edge quantitative methods and techniques to evaluate new health care technologies or therapies, inform clinical and policy decisions, and help prioritize future research. The team specializes in comprehensive analyses using advanced techniques that incorporate data and perspectives from multiple stakeholders, including patients, providers, and health systems. “People get the most innovation by looking at something that’s well known in a certain field using a different lens,” Ingram said.

The HIRE team also leverages other data science approaches (Markov models, simulations, machine learning, and natural language processing) to explore how physicians make medical decisions. Another study, published in the Journal of Oncology in February 2020, assessed the effectiveness and cost-effectiveness of biomarker-guided treatment for metastatic gastric cancer. This type of research helps medical providers determine the optimal regimen for care.

Ingram looks forward to applying more of his newly acquired data science skills to his work, particularly using data visualization to help make the HIRE lab’s findings more accessible to various audiences. He is also considering ways to leverage his expertise for consumer or business applications.

— Karina Alexanyan, Ph.D.