The Power of Simple Algorithms: From Data Science to Biological Systems

In this talk I will discuss the power of simple, randomized methods such as hashing, importance sampling, and stochastic iteration in data science and machine learning. In particular, I will overview my efforts to apply these methods to core linear algebraic primitives. I will describe a new class of iterative sampling algorithms, which give state-of-the-art theoretical and empirical performance for regression problems, low-rank matrix approximation, and kernel methods. I will demonstrate that sampling can be surprisingly powerful, giving, for example, sublinear time near-optimal low-rank approximation algorithms for general positive semidefinite matrices. I will conclude by discussing how understanding many of the same simple, randomized algorithms for data applications can be used to study computation in biological systems, including noisy decision making in social insect colonies.

Date:
Speakers:
Cameron Musco
Affiliation:
MIT CSAIL