When Columbia University graduate student Jordan Siff was recruited to help design and develop an interactive data science tool through the Data for Good Scholars program, he jumped at the opportunity to work on a “real-world” project.

“Class projects often overlook crucial aspects of the data science workflow, such as data wrangling and identifying or accounting for inconsistencies in the data collection methods,” said Siff, who studies statistics and holds a bachelor’s degree in computer science from Duke University.

Through the Data for Good Scholars program, which leverages data science to address a variety of social issues, including human rights, education, utilities, and structural projects, Siff collaborated with Data Science Institute (DSI) associate research scientist Ipek Ensari, University of Pittsburgh associate professor of science education Cassie Quigley, and ThroughlinesEDU founder Aileen Owens on the Citizen Science Interactive (CSI) dashboard.

CSI teaches fundamental data science skills and research methods through interactive, relatable individual and group-based learning. Students become “citizen scientists” by engaging with visualizations and research questions related to natural disasters through data and figures organized by topic (FEMA funding, blizzards, flash floods, heat, hurricanes, tornadoes, wildfires, sea level rise, and more) and by type (bar graphs, line graphs, box plots, scatterplots, and maps).

This collaboration began when Quigley and Owens received a Grable Foundation grant to integrate environmental justice into elementary and middle grade curricula. They hoped to build an educational dashboard similar to Ensari’s previous work on Puppy Scientist, an interactive data science tool for children that aggregates open-source data on dog prevalence in New York City and on bite incidents by breed, age, gender, day of the week, and more.

CSI features modules of varying difficulty to suit different skill levels. “It was important that the tool had a low floor, high ceiling so that students and teachers could use it in multiple ways and spaces to promote growth in learning,” Quigley explained.

The tool will launch across underserved school districts in midwestern and southern U.S. states next year, according to Owens.

DSI has connected more than 65 data science student volunteers like Siff to real-world projects such as CSI since the Data for Good Scholars program began in Spring 2019. The DSI research staff co-mentors most projects, and stipends are provided for selected students to serve as project coordinators alongside nonprofit organizations, government agencies, and professors both inside and outside Columbia University.

Ensari, who also serves as program director, has found that these collaborations provide a “win-win” opportunity for partner organizations and students. “For organizations, the program offers invaluable assistance with a specific problem, as well as an education in how to apply data science to advance their work,” she said. “For students, the real-world problems and actual data sets offer excellent training in working with the reality of messy data outside a classroom setting.”

For Siff, another valuable component of the experience is the opportunity to work with people of various backgrounds and expertise. “Working with interdisciplinary collaborators presented the exciting challenge of communicating complex statistical concepts in less esoteric terms and the unique opportunity to utilize others’ experiences and knowledge to create an improved application that meets the needs of the end-user,” he said. “I was ecstatic to be able to do all of this in the context of promoting environmental justice and equipping the next generation of students with the increasingly important tool of data science literacy, two causes that I care about deeply.”

Ensari considers such student enthusiasm one of the program’s primary metrics for success. “Seeing their drive and inspiration to pursue further research or other endeavors afterwards helps us know we’re doing something right.”

— Karina Alexanyan, Ph.D.