Columbia University data scientists from a range of academic fields took the stage at Data Science Day, the Data Science Institute’s flagship event gathering industry leaders and innovators together to learn about Columbia’s research at the forefront of data science across every field, profession, and sector.

At Data Science Day 2024, remarks from Columbia University President Minouche Shafik underscored the university’s commitment to data science and artificial intelligence, and a keynote by Prabhakar Raghavan, a Senior Vice President at Google, offered insight into the evolution of search engines. Three rounds of faculty lightning talks and a poster and demo session rounded out the day, which was hosted by Clifford S. Stein, Interim Director of the Data Science Institute, Wai T. Chang Professor of Industrial Engineering and Operations Research, and Professor of Computer Science.

President Minouche Shafik Opens Data Science Day

Columbia University President Minouche Shafik framed the event within her broader priorities for the future of the university, which include leadership in artificial intelligence.

“Universities have always been incredible, sometimes even magical places where students and scholars come together to create unique environments of learning and research, and we have a distinct perspective to bring at this historical moment when the potential of AI seems so great and yet its future shape is still to be formed. And it’s with that evolution in mind that Columbia’s Data Science Institute was founded more than a decade ago, and I am certain that it will play an enormous role in shaping what is to come in this incredibly important field.”

The Future of Search in the Age of AI

In the keynote speech, “Beyond information retrieval: What does Search mean these days?,” Prabhakar Raghavan, who is a Senior Vice President at Google as well as an accomplished scholar, reviewed advances in search technology over the decades and described how Google has matured far beyond simple information retrieval. He discussed work to better respond to user intent, how Google maintains its commitment to open information while giving appropriate weight to traditionally authoritative sources, and efforts to better serve speakers of languages that are underrepresented online.

Raghavan also touched upon experimental use of AI and large language models to respond to complex queries, noting that Google has integrated AI into its products for over a decade. He highlighted a pilot project using AI translation to make content accessible in more languages.

Closing his speech, Raghavan reflected on hearing a keynote decades earlier that posed the question: If you had all the computing cycles in the world, would it be possible to build a better search engine? Decades later, Raghavan said that, with accelerating technology and computational power, Google is making great strides in helping people on the never-ending quest for information.

The Pursuit of Fairness in Data Science

From selecting among equally accurate models for prediction tasks, to training learning-based algorithms on subjective clinical diagnostic data, to setting e-commerce prices based on consumer valuation, algorithmic models can raise thorny questions about the very meaning of fairness and how to address it. Data scientists Adam Elmachtoub, Associate Professor of Industrial Engineering, Shalmali Joshi, Assistant Professor of Biomedical Informatics at Vagelos College of Physicians and Surgeons, Emily Black, Assistant Professor of Computer Science at Barnard College, and Katja Maria Vogt, Professor of Philosophy, explored where innovation confronts ethics and philosophy.

Read the Recap: What is Fairness?

Computational Imaging Unlocks New Visual Frontiers


“Making the Most of an Image” delved into the full life cycle of data science. In this session, moderated by Yading Yuan, Herbert and Florence Associate Professor of Radiation Oncology at the Columbia University Medical Center, Shree K. Nayar, T.C. Chang Professor of Computer Science, Arian Maleki, Assistant Professor of Statistics, and Hortense Fong, Assistant Professor of Marketing at Columbia Business School, explored the frontiers of computational imaging; offered insight on fundamental challenges in coherent imaging systems; shared new advances in radiomics; and discussed research that pairs imagery with music to evoke emotion and stimulate action.

Session Recap: Making the Most of an Image

Sustainability Solutions Require Innovation and Policy Support

Developing and implementing climate change solutions at scale requires innovation, policy, and profitability. This session, moderated by Kate Ascher, Paul Milstein Professor of Professional Practice of Urban Development, explored the role of patents, policy, and models for urban decarbonization. Douglas Almond, Professor of Economics and International and Public Affairs, and Bianca Howard, Assistant Professor of Mechanical Engineering, contributed insights.

Session Recap: Innovation and Sustainability

100+ Posters and Demos from Faculty, Researchers, and Students

The event also featured a lively poster and demo session, with over 100 projects representing research in areas ranging from cancer screening to political analysis and language processing. The day ended with a networking session that gave attendees the opportunity to interact with Columbia’s data science faculty, rising scholars, and industry leaders, underscoring Columbia University’s commitment to the mantra “data for good.”