Machine Learning and AI Seminar Series
About
This seminar series invites experts from across the country to Columbia to present the latest cutting-edge research in Machine Learning and Artificial Intelligence. Running the gamut from theory to empirics, the seminar provides a single, unified space that brings together the ML/AI community at Columbia. Topics of interest include, but are not limited to, Language Models, Optimization for Deep Learning, Reinforcement and Imitation Learning, Learning Theory, Interpretability and AI Alignment, AI for Science, Probabilistic ML, and Bayesian Methods.
Hosts: DSI Foundations of Data Science Center; Department of Statistics, Graduate School of Arts and Sciences
Registration
Registration is preferred for all CUID holders. If you do not have an active CUID, registration is required and is due by 12:00 PM the day prior to the seminar. Unfortunately, we cannot guarantee entrance to Columbia’s Morningside campus if you register after that deadline. Thank you for understanding!
Please contact Erin Elliott, DSI Events and Marketing Coordinator, at ee2548@columbia.edu with any questions.
Next Seminar
Date: Friday, October 17, 2025 (11:00 AM – 12:00 PM)
Location: Columbia School of Social Work, Room C05

Volodymyr Kuleshov, Joan Eliasoph, M.D. Assistant Professor, Department of Computer Science, Cornell Tech and Cornell University
Title: Discrete Diffusion Language Models
Abstract: While diffusion generative models excel at high-quality image generation, prior work reports a significant performance gap between diffusion and autoregressive (AR) methods on discrete data such as text or biological sequences. Our work takes steps towards closing this gap via a simple and effective framework for discrete diffusion. This framework is simple to understand—it optimizes a mixture of denoising (e.g., masking) losses—and can be seen as endowing BERT-like models with principled samplers and variational estimators of log-likelihood. Crucially, our algorithms are not constrained to generate data sequentially, and therefore have the potential to improve long-term planning, controllable generation, and sampling speed.
In the context of language modeling, our framework enables deriving masked diffusion language models (MDLMs), which achieve a new state-of-the-art among diffusion models, and approach AR quality. Combined with novel extensions of classifier-free and classifier-based guidance mechanisms, these algorithms are also significantly more controllable than AR models. Discrete diffusion extends beyond language to science, where it forms the basis of a new generation of DNA foundation models. Our largest models focus on plants and set a new state of the art in genome annotation, while also enabling effective generation. Discrete diffusion models hold the promise to advance progress in generative modeling and its applications in language understanding and scientific discovery.
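For readers unfamiliar with masked diffusion, the sketch below illustrates the kind of objective the abstract describes: a mixture of masking (denoising) losses over random masking levels, optimized by a BERT-style bidirectional model. This is a hedged illustration only; the tiny Transformer, vocabulary size, and 1/t loss weighting are simplifying assumptions, not the speaker's implementation.

```python
# Minimal sketch of a masked-diffusion training objective: sample a masking
# level t, mask each token independently with probability t, and train a
# bidirectional denoiser to reconstruct the masked positions with a
# 1/t-weighted cross-entropy (an ELBO-style weighting).
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, MASK_ID, SEQ_LEN = 1000, 0, 64  # illustrative sizes; id 0 reserved for [MASK]

class TinyDenoiser(nn.Module):
    """Stand-in for any BERT-style bidirectional model."""
    def __init__(self, d_model=128, n_heads=4, n_layers=2):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, VOCAB)

    def forward(self, tokens):
        return self.head(self.encoder(self.embed(tokens)))  # (B, L, VOCAB) logits

def masked_diffusion_loss(model, x0):
    """Monte Carlo estimate of a mixture-of-masking-losses objective."""
    B, L = x0.shape
    t = torch.rand(B, 1).clamp(min=1e-3)      # masking level per sequence, t ~ U(0, 1)
    mask = torch.rand(B, L) < t               # mask each token independently w.p. t
    xt = torch.where(mask, torch.full_like(x0, MASK_ID), x0)
    logits = model(xt)
    ce = F.cross_entropy(logits.reshape(-1, VOCAB), x0.reshape(-1),
                         reduction="none").reshape(B, L)
    # Keep only masked positions, weight by 1/t, and average per token.
    return ((ce * mask) / t).mean()

model = TinyDenoiser()
x0 = torch.randint(1, VOCAB, (8, SEQ_LEN))    # toy batch of "clean" sequences
print(masked_diffusion_loss(model, x0).item())
```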
Upcoming Seminar Schedule (Fall 2025)
Please save the dates, times, and locations below if you plan to attend the seminar series.
Friday, October 24, 2025 (11:00 AM – 12:00 PM)
- Location: School of Social Work, Room C05
- Speaker: Furong Huang, Associate Professor, Department of Computer Science at the University of Maryland
Friday, November 7, 2025 (11:00 AM – 12:00 PM)
- Location: School of Social Work, Room C05
- Speaker: Florentin Guth, Faculty Fellow, Center for Data Science, NYU; and Research Fellow, Center for Computational Neuroscience, Flatiron Institute
Friday, November 21, 2025 (11:00 AM – 12:00 PM)
- Location: School of Social Work, Room C05
- Speaker: Andrej Risteski, Associate Professor, Machine Learning Department, Carnegie Mellon University
Friday, December 12, 2025 (11:00 AM – 12:00 PM)
- Location: School of Social Work, Room 311/312
- Speaker: Jason Weston, Research Scientist at Facebook, NY, and Visiting Research Professor at NYU
Archive: Speaker Abstracts
Title: Gradient Descent Dominates Ridge: A Statistical View on Implicit Regularization
Abstract: A key puzzle in deep learning is how simple gradient methods find generalizable solutions without explicit regularization. This talk discusses the implicit regularization of gradient descent (GD) through the lens of statistical dominance. Using least squares as a clean proxy, we present two surprising findings.
First, GD dominates ridge regression: with comparable regularization, the excess risk of GD is always within a constant factor of ridge, but ridge can be polynomially worse even when tuned optimally. Second, GD is incomparable with SGD. While it is known that for certain problems GD can be polynomially better than SGD, the reverse is also true: we construct problems, inspired by benign overfitting theory, where optimally stopped GD is polynomially worse. Finally, GD dominates SGD for a significant subclass of problems — those with fast and continuously decaying covariance spectra — which includes all problems satisfying the standard capacity condition.
This is joint work with Peter Bartlett, Sham Kakade, Jason Lee, and Bin Yu.
Talk Date: October 6, 2025
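As a companion to the abstract above, here is a small numerical sketch of the GD-versus-ridge comparison on least squares. It is illustrative only: it assumes a toy Gaussian design with a decaying covariance spectrum and the standard heuristic correspondence lambda ≈ 1/(eta·k) between ridge regularization and k steps of gradient descent at learning rate eta. It is not the speakers' construction or proof.

```python
# Compare early-stopped gradient descent with "comparably regularized" ridge
# regression on a synthetic least-squares problem (all sizes are illustrative).
import numpy as np

rng = np.random.default_rng(0)
n, d, sigma = 200, 50, 0.5
spectrum = np.arange(1, d + 1, dtype=float) ** -1.0        # decaying covariance spectrum
X = rng.standard_normal((n, d)) * np.sqrt(spectrum)        # Gaussian design with that spectrum
w_star = rng.standard_normal(d) / np.sqrt(d)
y = X @ w_star + sigma * rng.standard_normal(n)

def excess_risk(w):
    # Population excess risk under the Gaussian design above.
    diff = w - w_star
    return diff @ (spectrum * diff)

def gd(k, eta=0.01):
    # Gradient descent on the empirical squared loss, stopped after k steps.
    w = np.zeros(d)
    for _ in range(k):
        w -= eta * X.T @ (X @ w - y) / n
    return w

def ridge(lam):
    # Ridge estimator with penalty lam.
    return np.linalg.solve(X.T @ X / n + lam * np.eye(d), X.T @ y / n)

for k in [10, 100, 1000]:
    lam = 1.0 / (0.01 * k)  # heuristic correspondence lambda = 1 / (eta * k)
    print(f"k={k:5d}  GD risk={excess_risk(gd(k)):.4f}  "
          f"ridge(lam={lam:.2f}) risk={excess_risk(ridge(lam)):.4f}")
```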