{"id":169456,"date":"2001-11-02T14:41:11","date_gmt":"2001-11-02T14:41:11","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/project\/autoadmin\/"},"modified":"2019-02-05T12:04:17","modified_gmt":"2019-02-05T20:04:17","slug":"autoadmin","status":"publish","type":"msr-project","link":"https:\/\/www.microsoft.com\/en-us\/research\/project\/autoadmin\/","title":{"rendered":"AutoAdmin"},"content":{"rendered":"
Database management systems provide functionality that is central to developing business applications, and they are increasingly used as a key component of those applications. Yet tuning a database management system to achieve the required performance remains a significant problem and contributes to high total cost of ownership (TCO). The goal of our research in the AutoAdmin project is to make database systems self-tuning and self-administering. We achieve this by enabling databases to track the usage of their systems and to gracefully adapt to application requirements. Thus, instead of applications having to track and tune databases, the database actively tunes itself to be responsive to application needs. Our research has led to novel self-tuning components being included in Microsoft SQL Server.<\/div>\n


Physical Database Design<\/b><\/h2>\n

One of the primary areas of focus in the AutoAdmin project has been the challenge of automating physical database design<\/b>. Physical database design is the problem of identifying the appropriate physical design structures (e.g. indexes) for a given database and workload. Judicious choice of the physical database design is important since a well-chosen physical design can often improve workload performance by orders of magnitude. It is also a very challenging task that typically requires skilled DBAs who can reason about the impact on the workload of creating or dropping physical design structures. Our research on this problem led to the development of the Index Tuning Wizard<\/b> that shipped as part of Microsoft SQL Server 7.0 in 1998. This work was done in close collaboration with the Microsoft SQL Server product group. The Index Tuning Wizard was the first tool of its kind in any commercial DBMS. Other prominent DBMS vendors subsequently followed suit with similar physical design tuning tools. The architecture of this tool and the underlying algorithms are available in the VLDB 1997 paper “An Efficient Cost-Driven Index Selection Tool for Microsoft SQL Server<\/a>“. This paper won the VLDB<\/b> 10-Year Best Paper Award<\/b> in 2007. One of the key aspects of the architecture is an extension to the query optimizer to support a \u201cwhat-if\u201d interface: given a query and a set of hypothetical<\/b> physical design structures (e.g. indexes that do not currently exist), return the plan that the optimizer would have picked if that set of physical design structures were actually materialized. This interface, described in a SIGMOD 1998 paper \u201cAutoAdmin \u2018What-If\u2019 Index Analysis Utility\u201d, allows physical design tools to consider a large space of alternative physical designs without actually materializing the physical design structures. Similar \u201cwhat-if\u201d interfaces were subsequently adopted by other commercial DBMSs as well.<\/p>\n
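The \u201cwhat-if\u201d interface lets a tuning tool cost hypothetical designs without ever building an index. The Python sketch below illustrates the idea; the cost table, index names, and the exhaustive search are invented for illustration and stand in for actual calls to a what-if optimizer interface.

```python
# Sketch of how a physical design tool can drive a "what-if" optimizer
# interface. A toy cost table simulates the optimizer: in a real system,
# each lookup would be a what-if call returning the plan cost under a
# hypothetical configuration, with no index actually materialized.
from itertools import combinations

# Hypothetical per-query costs under each index configuration (toy numbers).
WHAT_IF_COST = {
    frozenset(): {"Q1": 100.0, "Q2": 80.0},
    frozenset({"ix_orders_date"}): {"Q1": 20.0, "Q2": 80.0},
    frozenset({"ix_cust_name"}): {"Q1": 100.0, "Q2": 15.0},
    frozenset({"ix_orders_date", "ix_cust_name"}): {"Q1": 20.0, "Q2": 15.0},
}

def workload_cost(config, workload):
    """Total estimated cost of the workload under one configuration."""
    costs = WHAT_IF_COST[frozenset(config)]
    return sum(costs[q] for q in workload)

def pick_best_config(candidates, workload, budget):
    """Exhaustively cost every configuration of at most `budget` indexes."""
    best, best_cost = frozenset(), workload_cost(frozenset(), workload)
    for k in range(1, budget + 1):
        for combo in combinations(candidates, k):
            c = workload_cost(combo, workload)
            if c < best_cost:
                best, best_cost = frozenset(combo), c
    return best, best_cost
```

A real tool prunes this search space aggressively; the point here is only that every candidate configuration is evaluated through optimizer estimates rather than by materializing structures.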

In Microsoft SQL Server 2000, the Index Tuning Wizard was extended to also recommend materialized views. The VLDB 2000 paper “Automated Selection of Materialized Views and Indexes for SQL Databases<\/a>” describes some of the key technical ideas for dealing with the large space of alternatives that arises with the inclusion of materialized views. In the Microsoft SQL Server 2005 release, the tool was changed from being a wizard to a full-blown application called the Database Engine Tuning Advisor <\/b>(DTA). DTA has been used effectively by many real-world customers of Microsoft SQL Server. DTA\u2019s recommendations have resulted in order-of-magnitude speedups for complex workloads on very large databases, and typically yield performance comparable to hand-tuned physical designs chosen by DBAs. Further information about the quality of DTA\u2019s recommendations and its scalability to large and complex real-world workloads can be found in the VLDB 2004 paper “Database Tuning Advisor for Microsoft SQL Server 2005<\/a>“. We have also published details of DTA\u2019s algorithms, including techniques for handling physical design structures such as partitioning, in the SIGMOD 2004 paper “Integrating Vertical and Horizontal Partitioning into Automated Physical Database Design<\/a>“. Finally, in many real-world scenarios, physical design tuning needs to capture rich constraints, e.g., no query in the workload should degrade by more than 10% with respect to the current configuration. Our work on constrained physical design tuning<\/a> won the Best Paper Award<\/b> at VLDB 2008.<\/p>\n
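As a rough illustration of the kind of constraint mentioned above, the following sketch (with invented cost numbers standing in for optimizer estimates) checks that no query regresses by more than 10% under a candidate configuration:

```python
# Toy illustration of one constraint from constrained physical design
# tuning: reject any candidate configuration under which some workload
# query regresses by more than 10% relative to the current configuration.
# All costs are invented numbers standing in for optimizer estimates.

def satisfies_no_regression(current_costs, candidate_costs, max_regression=0.10):
    """True if no query's estimated cost grows by more than `max_regression`."""
    return all(
        candidate_costs[q] <= current_costs[q] * (1 + max_regression)
        for q in current_costs
    )

current = {"Q1": 100.0, "Q2": 50.0}
good    = {"Q1": 40.0, "Q2": 54.0}   # Q2 up 8%: within the constraint
bad     = {"Q1": 10.0, "Q2": 70.0}   # Q2 up 40%: rejected despite the big Q1 win
```

Note that the second configuration is rejected even though its total workload cost is lower, which is exactly what distinguishes constrained tuning from plain cost minimization.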

We have also explored other modes<\/b> of physical design tuning. We studied how to enable a low-overhead physical design \u201calerter\u201d that could help DBAs answer the question: \u201cIs it worthwhile to tune the database now?\u201d Another interesting problem arises when there is no DBA available (e.g. an embedded database or a small business). In such scenarios, a low-touch, continuous index tuning approach becomes important. We have explored solutions for both scenarios in two papers: \u201cTo Tune or not to Tune? A Lightweight Physical Design Alerter<\/a>\u201d in VLDB 2006 and \u201cAn Online Approach to Physical Design Tuning<\/a>\u201d in ICDE 2007.<\/p>\n
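A lightweight alerter of this kind can be caricatured as a threshold test on potential improvement. In the sketch below, the per-query lower bounds, the 25% threshold, and the numbers are all illustrative assumptions, not the paper's actual algorithm:

```python
# Rough sketch of a physical design "alerter": for each query, track the
# optimizer's current cost and a cheap lower bound on its cost under a
# better physical design (e.g., assuming an ideal covering index exists).
# Alert the DBA when the potential workload-level improvement crosses a
# threshold. Numbers and the 25% threshold are illustrative assumptions.

def should_alert(observations, threshold=0.25):
    """observations: list of (current_cost, lower_bound_cost) per query."""
    current = sum(cost for cost, _ in observations)
    lower_bound = sum(lb for _, lb in observations)
    improvement = (current - lower_bound) / current
    return improvement >= threshold
```

The essential property, as in the VLDB 2006 paper's setting, is that this check is cheap enough to run continuously, deferring the expensive full tuning session until it is likely to pay off.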

Our initial work on physical database design was awarded the VLDB 10-Year Best Paper Award<\/b>; in that context, we wrote an overview article on the progress made in physical design tuning as well as other facets of self-tuning database systems.<\/p>\n

In addition to physical design structures, DTA also recommends appropriate statistics (e.g. histograms on columns of a base table) for the workload. Statistics are crucial for workload performance since the quality of the plan generated by the query optimizer depends heavily on the available statistics. Our work on techniques for determining appropriate statistics for a workload<\/a> won the Best Paper Award<\/b> in ICDE 2001. We have also worked on the problem of exploiting statistics on views<\/b> during query optimization (SIGMOD 2002, ICDE 2003). Using statistics on views can potentially improve the plan quality significantly, since the view can be defined on the result of a query expression, thereby eliminating errors due to assumptions that the optimizer otherwise would need to make.<\/p>\n

Monitoring and Execution Feedback<\/b><\/h2>\n

An important requirement for a self-tuning database system is the ability to efficiently monitor the database and identify potential performance problems. As part of the AutoAdmin project we are investigating some of these issues. We developed a SQL Continuous Monitoring engine <\/a>(SQLCM) in which DBAs can declaratively specify the monitoring tasks they wish to perform and have the engine execute them efficiently, with low overhead and a small memory footprint.<\/p>\n

A functionality that is of great interest to DBAs is estimating the progress of long-running SQL <\/a>queries which are common in decision support systems. Accurate progress estimates can allow DBAs to make judicious decisions on whether to allow a long-running query that is consuming significant resources to complete or to kill it. We defined the model of progress for a SQL query, developed efficient progress estimators and analytically quantified the inherent hardness of the problem.<\/a><\/p>\n
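One simple way to formalize such a progress model, loosely following the idea of measuring operator output, is shown below; the per-operator totals are assumed inputs (in practice, optimizer estimates that are refined as execution proceeds):

```python
# Simplified sketch of an operator-output progress model for a running
# SQL query: progress is the number of tuples the plan's operators have
# produced so far, divided by the estimated total they will produce.
# The totals are assumed inputs here; a real estimator starts from
# optimizer cardinality estimates and refines them during execution.

def progress(done_per_op, total_per_op):
    """Fraction of estimated total operator output produced so far."""
    return sum(done_per_op) / sum(total_per_op)
```

For example, a two-operator pipeline that has produced 500 and 100 tuples out of an estimated 1000 each is 30% done under this model. The quality of the estimate is only as good as the totals, which is why refining them, and quantifying when accurate estimation is inherently hard, is central to this line of work.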

The performance of a query can heavily depend on the quality of the execution plan chosen by the query optimizer. Often, many key parameters used by the optimizer cannot be reliably estimated up-front when the plan is chosen. Thus, we have studied the problem of how feedback from query execution can be used to improve the choice of execution plan of the query optimizer. First, we proposed the concept of Self-Tuning Histograms<\/b> (SIGMOD 1999, SIGMOD 2001) where the statistics (single and multi-dimensional histograms) are built using execution feedback, i.e. cardinalities observed during query execution. There has subsequently been much follow-on research work in the database community on this topic.<\/p>\n
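The flavor of feedback-driven histogram maintenance can be sketched as follows; the bucket layout, the boundary-alignment assumption, and the proportional rescaling are simplifications of the published techniques:

```python
# Minimal sketch of a self-tuning (feedback-driven) one-dimensional
# histogram: when a range query executes, the buckets covered by the
# range are rescaled so that the histogram's estimate matches the
# cardinality actually observed during execution. For simplicity,
# query ranges are assumed to align with bucket boundaries.

class SelfTuningHistogram:
    def __init__(self, boundaries, freqs):
        self.boundaries = boundaries  # e.g. [0, 10, 20, 30]
        self.freqs = list(freqs)      # one tuple count per bucket

    def _buckets(self):
        return list(zip(self.boundaries, self.boundaries[1:]))

    def estimate(self, lo, hi):
        """Estimated cardinality of a range predicate [lo, hi)."""
        return sum(f for (b_lo, b_hi), f in zip(self._buckets(), self.freqs)
                   if lo <= b_lo and b_hi <= hi)

    def refine(self, lo, hi, observed):
        """Rescale covered buckets so the estimate matches the feedback."""
        est = self.estimate(lo, hi)
        if est == 0:
            return
        scale = observed / est
        self.freqs = [f * scale if lo <= b_lo and b_hi <= hi else f
                      for (b_lo, b_hi), f in zip(self._buckets(), self.freqs)]
```

After refinement, the histogram answers the observed range exactly while untouched buckets keep their old counts; the published techniques additionally split and merge buckets to spend the memory budget where feedback shows the estimates are worst.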

Second, we showed that there are many simple cases where existing mechanisms for obtaining execution feedback are limiting, i.e. they do not help improve the execution plan, no matter how many times the query is executed. We therefore studied how to augment execution feedback mechanisms<\/a> to overcome some of these limitations while keeping the monitoring overheads at query execution time low. For example, executing an alternate plan whose cost is only slightly higher than the current plan\u2019s, but which produces accurate cardinalities for important expressions, can yield significant improvements in overall performance.<\/p>\n

Other Topics<\/b><\/h2>\n

We have also recently worked on other problems in the context of the AutoAdmin project:<\/p>\n