Active Exploration via Autoregressive Generation of Missing Data

2/13/25 | 4:15pm | E51-376


Daniel Russo

Philip H. Geier Jr. Associate Professor
Columbia Business School


Abstract: We cast the challenges of uncertainty quantification and exploration in online decision-making as a problem of training and generation from an autoregressive sequence model, an area experiencing rapid innovation. Central to our approach is viewing uncertainty as arising from missing outcomes that would be revealed through appropriate action choices, rather than from unobservable latent parameters of the environment. This reformulation aligns naturally with modern machine learning capabilities: we can i) train generative models through next-token prediction rather than fit explicit priors, ii) assess uncertainty through autoregressive generation rather than parameter sampling, and iii) adapt to new information through in-context learning rather than explicit posterior updating. To showcase these ideas, we formulate a challenging informed bandit learning task where effective performance requires leveraging unstructured prior information (like text features) while exploring judiciously to resolve key remaining uncertainties. We validate our approach through both theory and experiments. Our theory establishes a reduction, showing success at offline next-outcome prediction translates to reliable online uncertainty quantification and decision-making, even with strategically collected data. Semi-synthetic experiments show our insights bear out in a news-article recommendation task where article text can be leveraged to minimize exploration.

Bio: Daniel Russo is an Associate Professor in the Decision, Risk, and Operations division of the Columbia Business School. His research lies at the intersection of statistical machine learning and online decision-making, mostly falling under the broad umbrella of reinforcement learning. Outside academia, Dan has worked with Spotify to apply reinforcement learning and large language models to audio recommendations. Dan did his undergraduate studies in Math and Economics at the University of Michigan, completed his doctoral studies at Stanford University under the supervision of Benjamin Van Roy, and then worked as a postdoctoral researcher at Microsoft Research in New England. His research has been recognized by the Erlang Prize, the Frederick W. Lanchester Prize, a Junior Faculty Interest Group Best Paper Award, and first place in the George Nicholson Student Paper Competition.

Event Time:
4:15pm – 5:15pm