Modern data science calls for statistical inference algorithms that are both data-efficient and computation-efficient. We design and analyze methods that 1) outperform existing algorithms in the high-dimensional setting; 2) are as simple as possible while remaining efficiently computable.
One line of our work aims to bridge different fields and offer unified solutions to multiple questions. Consider the following canonical statistical inference tasks: distribution estimation, functional estimation, and property testing, all of which share the same model of sample access to an unknown discrete distribution. In a recent NeurIPS '19 paper, we showed that a single, simple, and unified estimator, profile maximum likelihood (PML), together with its near-linear-time computable variant, is sample-optimal for estimating multiple distribution attributes. The result covers 1) any appropriately Lipschitz, additively separable functional; 2) the sorted distribution; 3) Rényi entropy.
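To make the PML objective concrete, the sketch below (a minimal illustration, not the paper's implementation) computes the profile of a sample, i.e., the multiset of symbol multiplicities that PML maximizes over, and a brute-force Monte Carlo stand-in for the profile likelihood of a candidate distribution. The function names, the `trials` parameter, and the Monte Carlo approach are illustrative assumptions; the actual near-linear-time PML computation is far more involved.

```python
from collections import Counter
from math import log
import random


def profile(sample):
    """Return the profile of a sample: a map from multiplicity k to the
    number of distinct symbols appearing exactly k times. The profile is
    a sufficient statistic for symmetric properties of the underlying
    distribution, which is what PML exploits."""
    multiplicities = Counter(sample).values()  # count of each symbol
    return Counter(multiplicities)             # count of each count


def profile_log_likelihood(p, phi, n, trials=200_000, seed=0):
    """Monte Carlo estimate of log P_p(profile = phi): draw n-sample
    batches from the candidate distribution p and record how often their
    profile equals phi. PML seeks the p maximizing this quantity; this
    brute-force stand-in is feasible only for tiny n and support sizes."""
    rng = random.Random(seed)
    support = list(range(len(p)))
    hits = 0
    for _ in range(trials):
        batch = rng.choices(support, weights=p, k=n)
        if profile(batch) == phi:
            hits += 1
    return log(hits / trials) if hits else float("-inf")


# Example: "abracadabra" has counts a:5, b:2, r:2, c:1, d:1, hence
# profile {5: 1, 2: 2, 1: 2}; PML searches for the distribution that
# maximizes the probability of observing exactly this profile.
sample = list("abracadabra")
phi = profile(sample)
print(phi)  # Counter({2: 2, 1: 2, 5: 1})
```

Because the profile discards symbol labels, any estimator built on it is automatically invariant to relabeling, which is why one maximizer serves entropy, support size, sorted-distribution, and related estimation tasks simultaneously.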