Computational Trade-Offs in Statistical Learning: An Optimization Perspective

The past decade has seen the emergence of datasets of unprecedented scale, with both large sample sizes and high dimensionality. Massive datasets arise in many domains, among them computer vision, natural language processing, computational biology, social network analysis, and recommendation systems. In many such problems, the bottleneck is not just the number of data samples but also the computational resources available to process them. A fundamental goal is therefore to characterize how estimation error behaves as a function of the sample size, the number of parameters, and the available computational budget.

In this talk, I present three research threads that provide complementary lines of attack on this broader research agenda:

  1. Lower bounds for statistical estimation under computational constraints, via the oracle complexity of stochastic convex optimization (a toy sketch of the oracle model appears below);
  2. The interplay between statistical and computational complexity in structured high-dimensional estimation; and
  3. Distributed convex optimization algorithms for large-scale machine learning.

The first thread characterizes fundamental limits uniformly over all methods, whereas the latter two provide explicit algorithms that exploit the interaction between computational and statistical considerations.
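To make the oracle model in the first thread concrete, here is a minimal Python sketch; it is an illustration under stated assumptions, not the speaker's algorithm or results. Stochastic gradient descent is run on a toy least-squares problem, where each oracle call returns an unbiased gradient computed from one sample, so the number of calls stands in for the computational budget. The problem sizes, step size, and noise level are illustrative choices.

import numpy as np

# Illustrative sketch: SGD under a stochastic first-order oracle.
# Each oracle call reveals the gradient of the loss on one random sample;
# the total number of calls plays the role of the computational budget.

rng = np.random.default_rng(0)
d, n = 10, 5000
theta_star = rng.normal(size=d)                 # ground-truth parameter (assumed for the toy example)
X = rng.normal(size=(n, d))                     # design matrix
y = X @ theta_star + 0.1 * rng.normal(size=n)   # noisy responses

def oracle(theta, i):
    """Stochastic first-order oracle: gradient of the squared loss on sample i."""
    residual = X[i] @ theta - y[i]
    return residual * X[i]

theta = np.zeros(d)
for t in range(1, n + 1):                       # one oracle call per iteration
    i = rng.integers(n)
    theta -= (1.0 / np.sqrt(t)) * oracle(theta, i)
    # Estimation error typically shrinks as the oracle budget t grows.

print("final estimation error:", np.linalg.norm(theta - theta_star))

Running the sketch shows the estimation error decreasing as more oracle calls are spent; oracle-complexity lower bounds of the kind discussed in the talk bound how fast any method can make this error decrease for a given budget.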

[Joint work with John Duchi, Sahand Negahban, Clement Levrard, Pradeep Ravikumar, Peter Bartlett and Martin Wainwright]

Speaker Details

Alekh Agarwal is a fifth-year PhD student at UC Berkeley, jointly advised by Peter Bartlett and Martin Wainwright. Alekh has received PhD fellowships from Microsoft Research and Google. His main research interests are machine learning, convex optimization, high-dimensional statistics, distributed machine learning, and the computational trade-offs in machine learning problems.

Speakers: Alekh Agarwal
Affiliation: UC Berkeley