Lecture 3: Population recovery under high erasure probability

Date:

Thu, 21/03/201314:30

Lecturer:

Prof. Michael Saks, Rutgers University

In the population recovery problem (introduced by Dvir, Rao, Wigderson and Yehudayoff) there is an unknown probability distribution D over length n binary strings and we want to determine an estimate D* of the distribution such that for each string x, |D*(x)-D(x)| is at most some desired bound b. Samples can be obtained from the distribution, but each sample is obscured as follows: for each sampled string, each bit of the string is independently erased (changed to "?") with some probability q.

In this talk, I'll discuss my recent work with Ankur Moitra showing that for any constant erasure probability q<1, the distribution D can be well approximated using a number of samples (and also computaton) that is polynomial in n and 1/b. This improves on the previous quasi-polynomial time algorithm of Wigderson and Yehudayoff and the polynomial time algorithm of Dvir et al. which was shown to work for q<.7 by Bateman, Impagliazzo, Murray and Paturi. The algorithm we analyze is implicit in previous work on the problem; our main contribution is to analyze the algorithm. Using linear programming duality the problem is reduced to a question about the behavior of polynomials on the unit complex disk, which is answered using the Hadamard 3-circle theorem from complex analysis.