Provable Bounds in Machine Learning

Machine learning is a vibrant field with many rich algorithmic techniques. However, most approaches in the field are heuristic: we cannot prove good bounds on either their performance or their running time, except in quite limited settings. This talk will focus on the project of designing algorithms whose performance and running time can be analyzed rigorously.

As an example of this project, I will focus on my work on the nonnegative matrix factorization problem, which has important applications throughout machine learning (and theory). As is often the case, this problem is NP-hard when considered in full generality. However, we introduce a special case called “separable” nonnegative matrix factorization that we believe is the right notion in various contexts. We give a polynomial-time algorithm for this problem, and use this algorithm for more general problems involving learning topic models. This is an auspicious example of how theory can lead to inherently new algorithms with highly practical performance on real data sets.
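
The abstract does not spell out the algorithm itself, but the separability assumption is concrete enough to illustrate: every column of the nonnegative data matrix is a convex combination of a few “anchor” columns that themselves appear in the matrix, and finding those anchors yields the factorization. Below is a minimal Python sketch of the classical successive projection heuristic for this setting; the function name and toy data are illustrative, and this is not necessarily the specific algorithm presented in the talk.

    import numpy as np

    def successive_projection(A, r):
        """Greedily pick r candidate anchor columns of a separable
        nonnegative matrix A (each column of A is assumed to be a
        convex combination of r of A's own columns)."""
        R = A.astype(float).copy()
        anchors = []
        for _ in range(r):
            # The residual column of maximal norm is a vertex of the
            # convex hull of the columns, hence an anchor candidate.
            j = int(np.argmax(np.linalg.norm(R, axis=0)))
            anchors.append(j)
            # Project all columns onto the orthogonal complement of
            # the chosen column; it is zeroed out for later rounds.
            u = R[:, j] / np.linalg.norm(R[:, j])
            R -= np.outer(u, u @ R)
        return anchors

    # Toy separable instance: the first 3 columns of A are the anchors.
    rng = np.random.default_rng(0)
    W = rng.random((10, 3))
    Hmix = rng.random((3, 7))
    Hmix /= Hmix.sum(axis=0)              # convex-combination weights
    A = W @ np.hstack([np.eye(3), Hmix])
    print(sorted(successive_projection(A, 3)))  # typically [0, 1, 2]

Once the anchors are identified, the remaining factor can be recovered column by column with nonnegative least squares, which is what makes the separable case tractable in polynomial time.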

I will also briefly describe some of my other work on learning, including mixtures of Gaussians and robust linear regression, as well as promising directions for future work.

Speaker Details

Ankur Moitra is an NSF CI Fellow at the Institute for Advanced Study and a senior postdoc in the computer science department at Princeton University. He completed his PhD and MS at MIT in 2011 and 2009, respectively, advised by Tom Leighton. He received a George M. Sprowls Award for his doctoral dissertation and a William A. Martin Award for his master’s thesis, both best-thesis awards. He has worked in numerous areas of algorithms, including approximation algorithms, metric embeddings, combinatorics, and smoothed analysis, but has lately been working at the intersection of algorithms and learning.

Speaker: Ankur Moitra
Affiliation: MIT