Bard Early College seeks to increase access and success in higher education. The program offers high school students, particularly those from low-income and historically underserved communities, tuition-free, college-level studies in the liberal arts and sciences. The network serves approximately 3,000 students through seven campuses and a growing number of partnerships.

In Fall 2021, Bard introduced a new data science course, designed and led by Columbia University Data Science Institute (DSI) associate research scholar Susan McGregor.

“I’ve been teaching data science methods at the college level for over 10 years,” said McGregor, who also co-chairs DSI’s Data, Media and Society center. “I see these skills as critical literacies for people living in the 21st century, and they don’t require a lot of prerequisites—it’s more about a way of thinking. So there’s no reason that you couldn’t do this work with younger learners.”

During the semester’s roughly 50 course hours, 22 high school students worked with McGregor and Francesca Loiodice, a Barnard College computer science major and DSI Data for Good Scholar, to learn Python basics, algorithmic analysis, and the fundamentals of machine learning.

To help make data science concepts more concrete, McGregor encouraged the students to work in pairs and research a real-world algorithmic system of their choice. Their selected topics included autonomous vehicles’ driving systems, YouTube and Spotify recommendation algorithms, parole apps, and an algorithmic grading system.

“What is both important and exciting is bringing the mechanistic and the conceptual together,” McGregor said. “The practical aspect is important because it demystifies the process and empowers students to see what they can accomplish with a few lines of code. It also means that their critiques of a data-driven system can come from an informed place.”

Many Bard Early College students plan to go into STEM fields, but only a few have a sense of the rich and varied terrain beyond medicine and engineering, according to William Hinrichs, dean of academic life.

“A course like Intro to Data Science expands their view before they commit to any particular path of study,” Hinrichs said. “In the class that I observed, students were not just learning about data science, they were doing it.”

For their final projects, the Bard students advanced to generating original questions, forming hypotheses, and conducting expert interviews on ranked-choice voting, the GameStop/Reddit stock market spike, hurricane warning systems in Florida, and pandemic loan programs for Asian restaurants in New York City.

McGregor is actively seeking paths to replicate and adapt the course material for more young people. Ultimately, she hopes her students understand that data science is not just about calculations. “It’s about understanding how to design a question, perform the data collection, do data validation, so that you can truly understand the result of your calculations, and ensure that it accurately reflects the system or phenomenon you’re exploring.”

— Karina Alexanyan, Ph.D.