Foundations of Data Science Workshop (Spring 2025)
Tuesday, April 29, 2025 9:30 am - 1:00 pm
The Columbia University Data Science Institute’s Foundations of Data Science Center is hosting a workshop designed to foster collaboration and knowledge sharing. Through talks and posters, Columbia researchers will showcase their work in the diverse realms of data science methods and applications.
Event Registration
Tuesday, April 29, 2025 (9:30 AM – 1:00 PM ET) – In-Person Only
Location: Columbia Engineering Innovation Hub
Address: 2276 12th Ave, New York, NY 10027 – Manhattanville
Event Program
9:30 AM – 10:00 AM: Check-In, Breakfast, and Coffee
10:00 AM – 10:45 AM: Keynote
Christopher Harshaw, Assistant Professor of Statistics, Graduate School of Arts and Sciences, Columbia University
Talk Title: The Conflict Graph Design: Estimating Causal Effects Under Network Interference
Abstract: From political science and economics to public health and corporate strategy, the randomized experiment is a widely used methodological tool for estimating causal effects. In the past 15 years or so, there has been a growing interest in network experiments, where subjects are presumed to be interacting in the experiment and their interactions are of substantive interest. While the literature on interference has focused primarily on unbiased and consistent estimation, designing randomized network experiments to ensure tight rates of convergence is relatively under-explored. Not only are the optimal rates of estimation for different causal effects under interference an open question, but previously proposed designs have also been created in an ad hoc fashion. In this talk, I will present a new experimental design for network experiments called the “Conflict Graph Design” which, given a pre-specified causal effect of interest and the underlying network, produces a randomization over treatment assignment with the goal of increasing the precision of effect estimation. Not only does this experimental design attain improved rates of consistency for several causal effects of interest, it also provides a unifying approach to designing network experiments. We also provide consistent variance estimators and asymptotically valid confidence intervals, which facilitate inference of the causal effect under investigation. Joint work with Vardis Kandiros, Charis Pipis, and Costis Daskalakis at MIT.
10:45 AM – 11:00 AM: Coffee Break
11:00 AM – 11:40 AM: Short Talks
Testing Causal Models with Hidden Variables in Polynomial Delay via Conditional Independencies
Uncertainty Quantification for LLM-Based Survey Simulations
Randomized Quasi-Monte Carlo Features for Kernel Approximation
Distributional Matrix Completion via Nearest Neighbors in the Wasserstein Space
See team member information in the list of exhibitors below.
11:40 AM – 1:00 PM: Poster Session & Lunch
List of Exhibitors & Poster Numbers
P01: Learning Interpretable Optimal Treatment Regimes Using Kolmogorov-Arnold Networks
Faculty Advisor: Youmi Suk, Assistant Professor, Measurement, Evaluation, and Statistics Program, Department of Human Development, Teachers College
Chenguang Pan, PhD Student, Human Development, Teachers College
Yuxuan Li, PhD Student, Human Development, Teachers College
P02: Geometric Causal Models
Faculty Advisor: David Blei, Professor, Statistics and Computer Science, Graduate School of Arts and Sciences and Columbia Engineering
Eli Weinstein, Postdoc, Data Science Institute; and Statistics, Graduate School of Arts and Sciences
P03: Fast, Accurate Manifold Denoising by Tunneling Riemannian Optimization
Faculty Advisor: John Wright, Associate Professor, Electrical Engineering, Columbia Engineering
Mariam Avagyan, PhD Student, Columbia Engineering
Yihan Shen, Undergraduate, Computer Science, Columbia Engineering
Arnaud Lamy
Tingran Wang, PhD Student, MIT
Szabolcs Márka, Professor, Physics, Graduate School of Arts and Sciences
Zsuzsa Márka, Associate Research Scientist, Columbia Astrophysics Laboratory, Graduate School of Arts and Sciences
P04: Scalable Computation of Causal Bounds
Faculty Advisor: Garud Iyengar, Professor, Industrial Engineering and Operations Research, Columbia Engineering; and Avanessians Director, Data Science Institute
Madhumitha Shridharan, PhD Student, Columbia Engineering
P05: Probing adaptive decision-making under uncertainty using extended Hidden Markov Models
Faculty Advisor: Nuttida Rungratsameetaweemana, Assistant Professor, Biomedical Engineering, Columbia Engineering
Rudramani Singha, Research Scientist, Biomedical Engineering, Columbia Engineering
Jared Winslow, MS Student, Statistics, Columbia University
Robert Kim, Neurology, Cedars-Sinai Medical Center
John T Serences, Professor, Psychology, University of California San Diego
P06: Low regret Bayesian learning for Q-functions
Faculty Advisor: Shipra Agrawal, Associate Professor, Industrial Engineering and Operations Research, Columbia Engineering
Priyank Agrawal, PhD Student, Industrial Engineering and Operations Research, Columbia Engineering
P07: ClusterSC: Advancing Synthetic Control with Donor Selection
Faculty Advisor: Rachel Cummings, Associate Professor, Industrial Engineering and Operations Research, Columbia Engineering
Faculty Advisor: Vishal Misra, Professor of Computer Science and Vice Dean Computing and AI, Columbia Engineering
Andrew Tang, PhD Student, Computer Science, Columbia Engineering
Noah Bergam, PhD Student, Computer Science, Columbia Engineering
Saeyoung Rho, PhD Student, Computer Science, Columbia Engineering
P08: Experiment Design for Assortment Optimization
Will Ma, Associate Professor, Decision, Risk, and Operations, Columbia Business School
DreamSports (Dream11)
P09: Adaptive and Efficient Learning with Blockwise Missing and Semi-Supervised Data
Faculty Advisor: Ying Wei, Professor, Biostatistics, Mailman School of Public Health
Molei Liu, Assistant Professor of Biostatistics, Mailman School of Public Health
P10: A real-time EEG neurofeedback platform to predict Attend level via Muse-S
Xiaofu He, Assistant Professor, Clinical Neurobiology, Vagelos College of Physicians and Surgeons
Alfredo Spagna, Lecturer, Department of Psychology
P11: Synthetic Blip Effects: Generalizing Synthetic Controls for the Dynamic Treatment Regime
Faculty Advisor: Anish Agarwal, Assistant Professor, Industrial Engineering and Operations Research, Columbia Engineering
Dwaipayan Saha, PhD Student, Industrial Engineering and Operations Research, Columbia Engineering
Vasilis Syrgkanis, Assistant Professor, Management Science and Engineering, Stanford University
Sukjin Han, Professor of Economics, University of Bristol
Poster Presenters Giving Short Talks
P12 & Short Talk: Testing Causal Models with Hidden Variables in Polynomial Delay via Conditional Independencies
Faculty Advisor: Elias Bareinboim, Associate Professor, Computer Science; and Director, Causal Artificial Intelligence Lab, Columbia Engineering
Adiba Ejaz, PhD Student, Computer Science, Columbia Engineering
Hyunchai Jeong, PhD Student, Computer Science, Columbia Engineering
Jin Tian, Visiting Professor, Computer Science, Columbia Engineering
P13 & Short Talk: Uncertainty Quantification for LLM-Based Survey Simulations
Kaizheng Wang, Assistant Professor, Industrial Engineering and Operations Research, Columbia Engineering
Chengpiao Huang, PhD Student, Industrial Engineering and Operations Research, Columbia Engineering
Yuhang Wu, PhD Student, Decision, Risk, and Operations Division, Columbia Business School
P14 & Short Talk: Randomized Quasi-Monte Carlo Features for Kernel Approximation
Faculty Advisor: Zhiliang Ying, Professor, Statistics, Graduate School of Arts and Sciences
Yian Huang, PhD Student, Statistics, Graduate School of Arts and Sciences
Zhen Huang, PhD Student, Statistics, Graduate School of Arts and Sciences
P15 & Short Talk: Distributional Matrix Completion via Nearest Neighbors in the Wasserstein Space
Faculty Advisor: Anish Agarwal, Assistant Professor, Industrial Engineering and Operations Research, Columbia Engineering
Jacob Feitelberg, PhD Student, Industrial Engineering and Operations Research, Columbia Engineering