Machine-Generated Experimental Designs and the Future of Social Science

Social scientists are increasingly leveraging Generative AI models to learn about political and psychological processes. In political science, economics, and psychology, large language models are being used to classify content, generate experimental stimuli such as persuasive messages, simulate responses to surveys and incentivized experiments, and devise adaptive surveys that respond to participant input. While excitement about these tools is growing, guidance on how to incorporate them rigorously into research remains scarce. This workshop addresses that gap by bringing together scholars developing both the novel applications and the methodological approaches needed to advance the field.

Registration Request Form

Registration will be prioritized for Columbia faculty, postdoctoral researchers, affiliated scholars, and invited external guests. Students may be placed on a waitlist until closer to the event date. Everyone who submits a registration request will receive a confirmation email, and a calendar invitation will follow if the request is approved. Thank you for your interest in attending!


Event Details

Tuesday, March 31, 2026 (2:00 PM – 5:00 PM ET)
In-Person

Location: Columbia School of Social Work – Room C05
Address: 1255 Amsterdam Ave, New York, NY 10027 – Map
Timing: Workshop from 2:00 PM – 4:30 PM; Reception from 4:30 PM – 5:00 PM

· · ─ · ─ · ·

Details below subject to change:

2:00 PM – 2:10 PM: Introductions: Yamil R. Velez, Assistant Professor of Political Science, Arts & Sciences, Columbia University (10 min)

2:10 PM – 2:40 PM: Presentation: Yamil R. Velez (30 min)

2:40 PM – 2:45 PM: Break (5 min)

2:45 PM – 3:15 PM: Presentation: Eli Ben-Michael, Assistant Professor of Statistics & Data Science, Heinz College of Information Systems and Public Policy, Carnegie Mellon University (30 min)

3:15 PM – 3:20 PM: Break (5 min)

3:20 PM – 3:50 PM: Presentation: Kosuke Imai, Professor of Government and Statistics, Harvard University (30 min)

3:50 PM – 3:55 PM: Break (5 min)

3:55 PM – 4:25 PM: Q&A with Speakers (30 min)

4:25 PM – 4:30 PM: Closing Remarks (5 min)

4:30 PM – 5:00 PM: Networking Reception (30 min)

Registration Request Form


Speaker Details

Listed in order of program appearance:

Host & DSI Frontiers Awardee: Yamil R. Velez
Assistant Professor of Political Science, Arts & Sciences, Columbia University

When Personalization Works: Applications of Tailored Experiments using Generative AI

Abstract: I introduce tailored experiments, a design in which treatment stimuli adapt to individual characteristics, and present two applications. In joint work with Patrick Liu and Scott Clifford, I show that large language model-generated counterarguments targeting beliefs respondents identify as central to their attitudes (“focal beliefs”) produce significantly larger and more durable attitude change than counterarguments targeting topically relevant but personally peripheral beliefs (“distal beliefs”), even though both treatments successfully shift belief strength. In a related project, I compare large language model-generated campaign ads tailored to respondents’ demographics, personality traits, or self-identified issue priorities. Consistent with prior work, demographic and personality targeting produce null or negative effects, whereas issue-based targeting increases hypothetical candidate support by more than 10 percentage points relative to the strongest baseline condition. Across both studies, the central finding is consistent: identifying the sources of political preferences provides persuasive leverage. I present tailored experiments as a general framework for building individual-level heterogeneity into experimental design and conclude by showing how recent text deconfounding methods can be incorporated into this workflow.

Eli Ben-Michael
Assistant Professor of Statistics & Data Science, Heinz College of Information Systems and Public Policy, Carnegie Mellon University

AI-assisted Design and Analysis of Experiments with Unstructured Treatments

Abstract: Randomized experiments with unstructured treatments—such as text or images—are common in social science research. However, isolating the causal effect of a focal attribute (e.g., the style of text or facial features in images) is challenging because the attribute is typically correlated with other, non-focal attributes of the treatments. While AI technology could be used in an attempt to change focal attributes of a treatment while keeping all non-focal attributes identical, it offers no guarantees that non-focal attributes are not inadvertently changed in the process, such that confounding can still be an issue. We develop a framework for designing and analyzing experiments that target the isolated effect of a binary attribute of unstructured treatments. We consider designs where treatments are drawn from arbitrary distributions—including hand-crafted treatments, existing databases, or AI systems—and we map the bias of the difference-in-means estimator to the discrepancy in non-focal attributes across treatment arms. We develop a procedure that minimizes this bias via a second-stage rejection sampler that adjusts for observable imbalances in non-focal attributes, without assuming the original distributions correctly isolate the focal attribute. For analysis, we show how to conduct asymptotic inference for the difference-in-means estimator in a finite population setting, where inference is justified by the randomization of treatment. We also develop a calibrated model-assisted estimator that leverages AI predictions while guaranteeing unbiasedness and precision gains over the difference-in-means estimator. We demonstrate our approach through the design of two experiments with text and image treatments. In both cases, naturally occurring treatments exhibit large imbalances in non-focal attributes, and AI-generated treatments induce artifacts that also create observable differences. Our rejection sampler substantially mitigates these imbalances, while our model-assisted estimator effectively uses AI predictions to improve precision in estimating causal effects.
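To give attendees a feel for the second-stage rejection sampler idea, here is a minimal toy sketch—not the authors' implementation. It assumes a single observable non-focal attribute (bucketed word count) and resamples one arm's candidate pool so that the attribute's distribution matches the other arm's, reducing the imbalance that would otherwise bias a difference-in-means comparison.

```python
import random
from collections import Counter

def length_bin(text, width=5):
    # Toy non-focal attribute: word count, bucketed into bins of `width` words.
    return len(text.split()) // width

def attribute_hist(pool):
    # Normalized empirical distribution of the non-focal attribute over a pool.
    counts = Counter(length_bin(t) for t in pool)
    n = len(pool)
    return {b: c / n for b, c in counts.items()}

def rejection_sample(pool, target_hist, proposal_hist, n, rng):
    """Draw n treatments from `pool` (distributed as proposal_hist over the
    attribute) so that accepted draws follow target_hist over the attribute."""
    # Standard rejection-sampling envelope constant M = max target/proposal.
    M = max(target_hist.get(b, 0.0) / p
            for b, p in proposal_hist.items() if p > 0)
    accepted = []
    while len(accepted) < n:
        cand = rng.choice(pool)
        b = length_bin(cand)
        accept_p = target_hist.get(b, 0.0) / (M * proposal_hist.get(b, 1e-12))
        if rng.random() < accept_p:
            accepted.append(cand)
    return accepted
```

In practice the measured non-focal attributes would be higher-dimensional (and possibly model-predicted), but the design choice is the same: reject candidates whose attribute profile would push the two arms apart, rather than trusting the generator to hold non-focal attributes fixed.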

Kosuke Imai
Professor of Government and Statistics, Harvard University

GenAI-Powered Inference

Abstract: We introduce GenAI-Powered Inference (GPI), a statistical framework for both causal and predictive inference using unstructured data, including text and images. GPI leverages open-source Generative Artificial Intelligence (GenAI) models – such as large language models and diffusion models – not only to generate unstructured data at scale but also to extract low-dimensional representations that capture their underlying structure. Applying machine learning to these representations, GPI enables estimation of causal and predictive effects while quantifying associated estimation uncertainty. Unlike existing approaches to representation learning, GPI does not require fine-tuning of generative models, making it computationally efficient and broadly accessible. We illustrate the versatility of the GPI framework through several empirical applications. An open-source software package is available for implementing GPI.