About

This seminar series invites experts from across the country to come to Columbia and present the latest cutting-edge research in the field of Machine Learning and Artificial Intelligence. Running the gamut from theory to empirics, the seminar provides a single, unified space to bring together the ML/AI community at Columbia. Topics of interest include, but are not limited to, Language Models, Optimization for Deep Learning, Reinforcement and Imitation Learning, Learning Theory, Interpretability and AI Alignment, AI for Science, Probabilistic ML, and Bayesian methods.

Hosts & Co-Sponsors: DSI Foundations of Data Science Center; Department of Statistics; Arts and Sciences; Columbia Engineering

Registration

Registration for all CUID holders is preferred. If you do not have an active CUID, registration is required and is due by 12:00 PM the day prior to the seminar. Unfortunately, we cannot guarantee entrance to Columbia’s Morningside campus if you register after 12:00 PM the day prior to the seminar. Thank you for understanding!

Please contact Erin Elliott, DSI Events and Marketing Coordinator, at ee2548@columbia.edu with any questions.

Next Seminar

Date: Friday, March 27, 2026 (11:00 AM – 12:00 PM)

Location: Hamilton Hall, Room 702

Eric Wong, Assistant Professor, Computer and Information Science, University of Pennsylvania

Title: A Mechanistic Theory of Safety: How Jailbreaking 1-Layer Transformers Taught Us How to Steer LLMs

Abstract: Why are LLM guardrails so fundamentally easy to break, and how can we enforce them? This talk formalizes a mechanistic theory for studying safety problems. We begin with one-layer transformers, identifying rule-breaking as an inherent architectural vulnerability in the model’s attention mechanism. This mechanistic theory framework (LogicBreaks) taught us a critical lesson: if attention is the key to breaking rules, it may also be the key to enforcing them.

Building upon this insight, we expand the mechanistic theory to analyze attention-based interventions, arriving at InstaBoost: an incredibly simple yet highly effective steering method that boosts the model’s attention on user-provided instructions during generation. This technique, developed from analysis of one-layer transformers, provides state-of-the-art control over large-scale LLMs with just five lines of code.
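The exact InstaBoost formulation is the speaker's; as a hypothetical illustration of the general idea the abstract describes (upweighting a model's attention on instruction tokens and renormalizing), a minimal NumPy sketch might look like this. The function name, the boost factor `alpha`, and the decision to boost in probability space are all assumptions for the sake of the example:

```python
import numpy as np

def boosted_attention(scores, instruction_mask, alpha=5.0):
    """Illustrative sketch: amplify attention on instruction tokens.

    scores: (query_len, key_len) raw pre-softmax attention scores
    instruction_mask: boolean (key_len,) marking instruction-token positions
    alpha: multiplicative boost applied to attention on instruction tokens
    """
    # Standard softmax over key positions (numerically stabilized).
    probs = np.exp(scores - scores.max(axis=-1, keepdims=True))
    probs /= probs.sum(axis=-1, keepdims=True)
    # Upweight the mass on instruction tokens, then renormalize so each
    # query's attention distribution still sums to 1.
    probs[:, instruction_mask] *= alpha
    probs /= probs.sum(axis=-1, keepdims=True)
    return probs
```

With uniform scores and `alpha > 1`, instruction positions end up with strictly more attention mass than the remaining positions, which is the steering effect the abstract gestures at.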

Register

Upcoming Seminar Schedule (Spring 2026)

Please save the dates and times below to attend the seminar series.

Friday, April 10 (11:00 AM – 12:00 PM) 

  • Location: Hamilton Hall, Room 702
  • Speaker: Greg Durrett, Associate Professor, Computer Science Department and Center for Data Science, NYU Courant
  • Register

Friday, April 17 (11:00 AM – 12:00 PM)

  • Location: Hamilton Hall, Room 702
  • Speaker: Danqi Chen, Associate Professor of Computer Science, Co-Leader of Princeton NLP Group, Associate Director of Princeton Language and Intelligence, Princeton University
  • Register