Friday, April 17, 2026, 11:00 am - 12:00 pm
Danqi Chen, Associate Professor of Computer Science, Co-Leader of Princeton NLP Group, Associate Director of Princeton Language and Intelligence, Princeton University
Location: Hamilton Hall, Room 702
Abstract: Language models’ context sizes have rapidly increased from thousands to millions of tokens, reshaping how we build and use these models. In this talk, I will trace this evolution along three dimensions: (1) how we think about training long-context language models from data (and architecture) perspectives, (2) how our evaluation and applications have shifted — from synthetic retrieval tests to test-time scaling and long-horizon agents, and (3) how we should rethink inference and scaffolding to make better use of long context, beyond naively filling the context window. I will draw on recent work from our group on long-context model training, evaluation, and effective context management for long-horizon agentic tasks.