The Capstone course provides a unique opportunity for students in the M.S. in Data Science program to apply their knowledge of the foundations, theory and methods of data science to address data-driven problems in industry and government. In this course, student teams work with representatives from DSI Industry Affiliate companies and Columbia faculty on semester-length data science projects. The resulting projects synthesize the statistical, computational, engineering and social challenges involved in solving complex real-world problems.

Join our event to explore the projects, see demos, and meet with the participating students and mentors. Find project themes and companies below. 

Event Date & Time

Tuesday, May 3 (2:00 PM – 5:00 PM ET) — VIRTUAL

DSI Industry Affiliates have access to Capstone projects following the event. Please reach out to datascience@columbia.edu with any questions about the Capstone program.


2:00 PM: Join the Event. The event will be held on Gatherly, an interactive virtual platform where guests can walk around and meet new people, just like in real life. Attendees can navigate Gatherly floors, organized by Capstone project topic area, where students will stand by their projects to give short presentations and answer questions.

2:05 PM: Introduction from Capstone Faculty. Learn more about the Capstone program and its impact across the Data Science Institute and Columbia University at large.

2:10 PM: Presentations. Presentations will be open until 5:00 PM ET; guests are welcome to float in and out of Gatherly floors to see all of the demos, or focus on exploring projects within areas of interest.

  • FLOOR 1: Geospatial and ML Methodology
  • FLOOR 2: AI-Computer Vision, AI-Speech, Finance, and Time Series
  • FLOOR 3: Natural Language Processing (NLP)

5:00 PM: Event ends.

Navigating the Event

Welcome Floor

Access the DSI help desk, where representatives of our student services team will be available to assist you. Move your mouse to the elevators, where you can head to the floors to see the student projects.

FLOOR 1: Geospatial and ML Methodology

POSTER 1: Who Gets Access to the Internet? 

  • Students: Dashansh Prajapati (dp3085), Santos Hernandez (smh2283), Kevia Q (kq2153), Brian Hernandez (bmh2168)
  • Mentor: Henning Schulzrinne, Columbia University
  • Category: Geospatial

POSTER 2: Visualizing Fleet Data for Operational Change

  • Students: Xiaoyi Zhang (xz2956), Wanxiao Hong (wh2493), Guangyu Wu (gw2415)
  • Mentors: Rebecca Behle, Fleet Services, New York City Department of Environmental Protection; and Sharon Di, Columbia University
  • Category: Geospatial

POSTER 3: ML Techniques for High Precision and High Explainability

  • Students: Vincent Chang (sc4755), Brian Mao (bm3024), Chris Petty (cp3209), Yuan Wang (yw3585)
  • Mentors: Harsha Aeron, General Electric; and Henry Lam, Columbia University
  • Category: ML Methodology

POSTER 4: Transfer Learning for Control

  • Students: Aaron Aknin (ama2351), Raphaël Adda (rea2157), Saisamrit Surbehera (ss6365)
  • Mentor: Alvaro Velasquez, Air Force Research Laboratory
  • Category: ML Methodology

POSTER 5: Quantifying Contributions of Different Features to Unfairness in AI

  • Students: Xin Ye (xy2509), Yue Xiong (yx2697), Panyu Gao (pg2676), Liyi Zhang (lz2574), Yuzhao Pan (yp2578)
  • Mentors: Sanghamitra Dutta, JPMorgan Chase & Co.; and Rachel Cummings, Columbia University
  • Category: ML Methodology

FLOOR 2: AI-Computer Vision, AI-Speech, Finance, and Time Series

POSTER 6: Table Extraction via Eye Gaze Tracking

  • Students: Yijia Jin (yj2682), Yibai Liu (yl4616), Yeqi Zhang (yz3975), Shihang Wang (sw3275), Yinqiu Feng (yf2579)
  • Mentor: Nonie Thomas, JPMorgan Chase & Co.
  • Category: AI-Computer Vision

POSTER 7: Building Speech Emotion Recognition Systems for Low Resourced Languages

  • Students: Zihan Wang (zw2782), Haifeng Lan (hl3487), Kehao Guo (kg2937), Qi Meng (qm2162), Xinrui Zhang (xz2976)
  • Mentor: Akshat Gupta, JPMorgan Chase & Co.
  • Category: AI-Speech

POSTER 8: Exploring Equity Markets Closing Auction

  • Students: Jiaxi Zhou (jz3280), Haoxiong Su (hs3228), Tianchun Huang (th2884), Hang Luo (hl3434), Yiran Shu (ys3373)
  • Mentor: Lada Kyj, Vanguard 
  • Category: Finance

POSTER 9: Hierarchical Forecasting for External Cost

  • Students: Zhengyi Chen (zc2549), Shiyue Liu (sl4835), Yaojia Ye (yy3084), Anni Chen (ac4779), Qixiao He (qh2232)
  • Mentors: Patricia Vega, Novartis; and Vishal Misra, Columbia University
  • Category: Time Series

POSTER 10: Exploring and Predicting Loss of Exclusivity and its Business Impact

  • Students: Tushar Agrawal (tsa2131), Saloni Gupta Ajay Kumar (sg3910), Moulay-Zaidane Draidia (mad2314), Smarth Gupta (sg3868)
  • Mentor: Sarah Asio, Johnson & Johnson
  • Category: Time Series

FLOOR 3: Natural Language Processing (NLP)

POSTER 11: Topic Directionality in Financial Statements

  • Students: Jingyuan Wang (jw4000), Xuan He (xh2465), Weiwei Jiang (wj2312), Boquan Sun (bs3232), Hanqin Zhou (hz2699)
  • Mentor: Steven Agajanian, PIMCO
  • Category: NLP

POSTER 12: Social Media Product Quality Insights Generation

  • Students: Ying Bi (yb2500), Yiyuan Xu (yx2632), Yuao Zhao (yz3540), Zihao Liu (zl2986), Gaoge Liu (gl2701)
  • Mentors: Korey Phillips; Mathida Chuk; and Tyler Littlefield, Johnson & Johnson
  • Category: NLP

POSTER 13: Performing NLP Tasks on Unstructured Financial Documents

  • Students: Heng Kan (hk312), Huaqing Fang (hf2431), Pengyu Zou (pz2272), Xiaorui Qin (xq2209), Ziyu Fang (zf2253)
  • Mentor: Simerjot Kaur, JPMorgan Chase & Co.
  • Category: NLP

POSTER 14: Interest Diversity and Brokerage in Networks: A Case Study of Twitter

  • Students: Xinyi Liu (xl3057), Eugenio Beaufrand (eab2271), Zhuoyan Ma (zm2355), Yunshan An (ya2367), Rao Tetal (rht2115)
  • Mentor: Sandra Matz, Columbia University Graduate School of Business
  • Category: NLP

POSTER 15: Data-driven Competitor Product Identification and Ranking

  • Students: Chenhui Mao (cm4054), Nuanyu Shou (ns3492), Mingyan Zou (mz2828), Xingcheng Rong (xr2150), Jiachen Huang (jh4336)
  • Mentor: Syed Haider, Unilever
  • Category: NLP

Capstone Faculty

Sining Chen, Adjunct Professor of Industrial Engineering and Operations Research, Columbia University

Adam S. Kelleher, Adjunct Assistant Professor of Computer Science, Columbia University

Katie Jooyoung Kim (Course Assistant)