The Capstone course provides a unique opportunity for students in the M.S. in Data Science program to apply their knowledge of the foundations, theory and methods of data science to address data-driven problems in industry, government and the non-profit sector. Course activities focus on a semester-length project sponsored by a local organization. The resulting projects synthesize the statistical, computational, engineering and social challenges involved in solving complex real-world problems.

Join our event to explore the projects, see demos, and meet with the participating students and mentors. Project themes and sponsoring organizations are listed below.

Event Date & Time

Tuesday, December 14 (2:00 PM – 5:00 PM ET) — VIRTUAL


2:00 PM: Join the Event. The event will be held on Gatherly, an interactive virtual platform where guests can walk around and meet new people, just like in real life. Attendees can navigate the Gatherly floors, which are organized by Capstone project topic area; on each floor, students will stand by their projects to give short presentations and answer questions.

2:05 PM: Introduction from Capstone Faculty. Learn more about the Capstone program and its impact across the Data Science Institute and Columbia University at large.

2:10 PM: Presentations. Presentations will be open until 5:00 PM ET; guests are welcome to float in and out of Gatherly floors to see all of the demos, or focus on exploring projects within areas of interest.

  • Floor 1: Natural Language Processing (NLP)
  • Floor 2: Neural Networks & Time Series
  • Floor 3: Fairness & Machine Learning

5:00 PM: Event ends.

Navigating the Event

Welcome Floor

Access the DSI help desk, where representatives of our student services team will be available to assist you. Move your mouse to the elevators to head to the floors and see the student projects.

Floor 1: Natural Language Processing (NLP)

POSTER 1: Using Natural Language Processing to Discover COVID-19 Impacts on Birthing Families from Social Media

  • Barnard
  • Mentors: Adam Poliak, Caitlin Dreisbach
  • Students: Xiaoyan Li, Neha Santhoshi Pusarla, Miranda Gao Zhou, Lu Bin Liu, Gaoyi Shi

POSTER 2: The Power of Peace Speech

  • Earth Institute | LD
  • Mentor: Peter Coleman
  • Students: Haoyue Qi, Yuxin Zhou, Xuanhao Wu, Hongling Liu, Wenjie Zhu

POSTER 3: Generating Related Work Sections for Scientific Papers: Part 1: Domains

  • Elsevier
  • Mentor: Anita de Waard
  • Students: Tingyi Lu, Yifan Jing, Jiayin Lin, Yuhe Wang

POSTER 4: Measuring Strategic Pivots

  • Graduate School of Business
  • Mentor: Jorge Guzman
  • Students: Heather Zhu, Weiyao Xie, Ningxin Li, Angela Zhou, Sally Bao

POSTER 5: Automated Data Labeling with NLP and Active Learning

  • Pimco
  • Mentors: Steven Agajanian, Kurt Vile, Evan David, Ji Zhang
  • Students: Chelsea Cui, Zhibin Li, Jingyan Xu, Yuan Cheng, Yifei Zhang

POSTER 6: Polling, ObamaCare, Mainstream News Spreading Misinformation

  • Microsoft Research NYC
  • Mentor: David Rothschild
  • Student: Kevin Gao

Floor 2: Neural Networks & Time Series

POSTER 7: Neural Semantic Proto-Role Labeler

  • Bloomberg
  • Mentors: Yuval Marton, Asad Sayeed
  • Students: Sai Thrinath Gunda, Tarun Devireddy, Mitali Bante, Sriram Dommeti

POSTER 8: Graph Machine Learning for Mixed Data Sources

  • JP Morgan
  • Mentors: Naftali Cohen, Srijan Sood, Zhen Zeng
  • Students: Bo Hu, Qin Rui, Wenxuan Liu, Erdong Wang, Shuibenyang Yuan

POSTER 9: Assistive Robot: Recognize and Engage with People

  • JP Morgan
  • Mentors: Naftali Cohen, Srijan Sood, Zhen Zeng
  • Students: Wenjun Cheng, Kenny Jin, Jiongxin Ye, Xiaoyu Su, Yuzheng Jia

POSTER 10: Hierarchical Time Series Forecasting

  • JP Morgan
  • Mentors: Naftali Cohen, Srijan Sood, Zhen Zeng
  • Students: Diyue Gu, Zujun Peng, Yifei Chen, Haichao Yi, Yilan Jiang

POSTER 11: Machine Learning Model for Atomic Structure of Sustainable Energy Materials

  • SEAS
  • Mentor: Simon Billinge
  • Students: Qiran Li, Jingyuan Li, Chaoying Zheng, James Ding, Sidney Fletcher

POSTER 12: Overall Market Earnings Growth Forecasting

  • KPMG
  • Mentors: Nicholas Abell, Ryan Deming, Sydney Son
  • Students: Chenxi Di, Nan Tang, Liyuan Tang, Tianqi Lou, Yuxin Qian

POSTER 13: Impact Estimation of New Competitors in Markets with Simultaneous Events

  • Novartis
  • Mentors: Gerard Sanz-Estape, Laura Rodriguez-Gomez, Javier Cerezo
  • Students: Wendy Qian, Hanlin Tong, Lihui Pan, Zhiheng Jiang, Yishi Wang

POSTER 14: Intraday Volatility

  • Vanguard
  • Mentor: Lada Kyj
  • Students: Sung-Kuk Lim, Young Hoo Cho, Sanket Sunil Gokhale, Minwoo Choi, Chenchao You

Floor 3: Fairness & Machine Learning

POSTER 15: Climate Justice: Quantifying the Impacts of Floods on Socially-Vulnerable People in the US

  • Earth Institute | LD
  • Mentor: Marco Tedesco
  • Students: Tomislav Galjanic, Christodoulos Constantinides, Abhishek Sinha, Samir Char

POSTER 16: Predicting Pharmaceutical Usage and Adverse Effects

  • Goldman Sachs
  • Mentor: Joe Kogan
  • Students: Aditya Koduri, Archit Matta, Karunakar Gadireddy, Shivani Modi, Yosha Singh Tomar

POSTER 17: Predictive Models to Understand Patient Risks in Orthopedics

  • Johnson & Johnson
  • Mentor: Chin-Wen Chang
  • Students: Guotian Zhu, Pan Jiayi, Cai Yiwen, Yidan Gao, Lingxuan Gu

POSTER 18: Algorithmic Fairness in Healthcare

  • Johnson & Johnson
  • Mentors: Thibaut Galvain, Cindy Tong
  • Students: Jingyi An, Jialu Xia, Yuanhang Chen, Dingwen Xie, Run Zhang

POSTER 19: Market Basket Analysis

  • Ralph Lauren
  • Mentor: Nandakumar Sudha
  • Students: Keertan Krishnan, Rahul Agarwal, Rahul Subramaniam, Shaurya Malik, Myles Ingram

POSTER 20: Causality-Informed Fairness Treatments of Unfair AI Systems

  • JP Morgan
  • Mentors: Sanghamitra Dutta, Naftali Cohen
  • Students: Oscar Jasklowski, Yue Wang, Mohammed Aqid Khatkhatay, Xue Gu, Junzhi Ge

Capstone Faculty

Sining Chen, Adjunct Professor of Industrial Engineering and Operations Research, Columbia University

Adam S. Kelleher, Adjunct Assistant Professor of Computer Science, Columbia University