Compressing SQL Workloads
Recently several important relational database tasks such as index selection, histogram tuning, approximate query processing, and statistics selection have recognized the importance of leveraging workloads. Often these tasks are presented with large workloads, i.e., a set of SQL DML statements, as input. A key factor affecting the scalability of such tasks is the size of the workload. In this paper, we present the novel problem of workload compression which helps improve the scalability of such tasks. We present a principled solution to this challenging problem. Our solution is broadly applicable to a variety of workload-driven tasks, while allowing for incorporation of task specific knowledge. We have implemented this solution and our experiments illustrate its effectiveness in the context of two workload-driven tasks: index selection and approximate query processing.
Copyright © 2007 by the Association for Computing Machinery, Inc. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Publications Dept, ACM Inc., fax +1 (212) 869-0481, or permissions@acm.org. The definitive version of this paper can be found at ACM's Digital Library --http://www.acm.org/dl/.