We study the classical problem of prediction with expert advice in the adversarial setting with a geometric stopping time. Cover (1965) gave the optimal algorithm that minimizes worst-case regret for the case of 2 experts. In this talk, I will describe the optimal algorithm, adversary and regret for the case of 3 experts. We will see that optimal algorithm for 2 and 3 experts is a probability matching algorithm (analogous to Thompson sampling) against a particular randomized adversary. Remarkably, it turns out that this algorithm is not only optimal against this adversary, but also minimax optimal against all possible adversaries. A novel aspect of our analysis is that upper and lower bounds are proved simultaneously, analogous to the primal-dual method. The analysis of the optimal adversary relies on delicate random walk estimates. At the end of the talk, I will discuss the case of "Bandit feedback", when we just learn the gain of the action we chose, and analyze the effects of imposing a switching cost.(Talk based on joint works with Nick Gravin and Balu Sivan and with Ofer Dekel, Jian Ding and Tomer Koren.)

## Date:

Sun, 07/06/2015 - 16:00 to 17:00

## Location:

Elath Hall, 2nd floor, Feldman Building, Edmond J. Safra Campus