Composable Coresets for Constrained Determinant Maximization and Beyond
- Sepideh Mahabadi
- Thuy-Duong Vuong
We study the task of determinant maximization under a partition constraint, in the context of large data sets. Given a point set V⊂R^d that is partitioned into s groups V_1,...,V_s, and integers k_1,...,k_s where k=∑_i k_i, the goal is to pick k_i points from group i such that the overall determinant of the picked k points is maximized. Determinant maximization and its constrained variants have gained a lot of interest for modeling diversity and have found applications in the context of fairness and data summarization.
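To make the objective concrete, the following is a minimal brute-force sketch of the partition-constrained problem on a toy instance. It assumes the "determinant of the picked points" is the determinant of their Gram matrix (the squared volume they span), which is the natural reading when k≤d; the function names `gram_det` and `brute_force_partition_detmax` are hypothetical and are not taken from the paper. Exhaustive search is exponential in k and is shown only to illustrate the definition.

```python
import itertools
import numpy as np

def gram_det(points):
    """Determinant of the Gram matrix of the rows in `points` (assumed objective, k <= d)."""
    A = np.asarray(points, dtype=float)
    return float(np.linalg.det(A @ A.T))

def brute_force_partition_detmax(groups, quotas):
    """Try every way of picking quotas[i] rows from groups[i].

    Returns the best objective value and, per group, the tuple of chosen row indices.
    """
    per_group_choices = [
        list(itertools.combinations(range(len(g)), q))
        for g, q in zip(groups, quotas)
    ]
    best_val, best_pick = -1.0, None
    for pick in itertools.product(*per_group_choices):
        chosen = [groups[i][j] for i, idx in enumerate(pick) for j in idx]
        val = gram_det(chosen)
        if val > best_val:
            best_val, best_pick = val, pick
    return best_val, best_pick

# Toy instance: s = 2 groups in R^2, with k_1 = k_2 = 1.
groups = [
    np.array([[1.0, 0.0], [2.0, 0.0]]),
    np.array([[0.0, 1.0], [1.0, 3.0]]),
]
value, pick = brute_force_partition_detmax(groups, quotas=[1, 1])
```

Here the best choice takes (2, 0) from the first group and (1, 3) from the second, whose Gram matrix has determinant 36.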
We study the design of composable coresets for the constrained determinant maximization problem. Composable coresets are small subsets of the data that (approximately) preserve optimal solutions to optimization tasks and enable efficient solutions in several other large data models, including the distributed and the streaming settings. In this work, we consider two regimes. For the case of k>d, we show a peeling algorithm that yields a composable coreset of size kd with an approximation factor of d^{O(d)}. We complement our results by showing that this approximation factor is tight. For the case of k≤d, we show that a simple modification of the previous algorithms results in an optimal coreset, as verified by our lower bounds. Our results apply to all strongly Rayleigh distributions and several other experimental design problems. In addition, we show coreset construction algorithms under the more general laminar matroid constraints.
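For reference, the composability property can be written as follows. This is the standard definition of an α-composable coreset from the coreset literature, stated here as an assumption about the intended meaning rather than quoted from the paper; the symbols c, m, P_j, and OPT are introduced only for this illustration.

```latex
% Assumed notation: c maps a point set to a subset of itself,
% OPT(P) is the optimal value of the (constrained) determinant objective on P.
% c is an \alpha-composable coreset if, for every collection P_1, \dots, P_m \subseteq \mathbb{R}^d,
\[
  \mathrm{OPT}\Bigl(\,\bigcup_{j=1}^{m} c(P_j)\Bigr)
  \;\ge\;
  \frac{1}{\alpha}\,\mathrm{OPT}\Bigl(\,\bigcup_{j=1}^{m} P_j\Bigr).
\]
```

Under this reading, each machine or stream chunk computes c(P_j) independently, and solving the problem on the union of the coresets loses at most a factor α, which is d^{O(d)} in the k>d regime described above.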