{"id":154450,"date":"2004-01-01T00:00:00","date_gmt":"2004-01-01T00:00:00","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/msr-research-item\/conditional-selectivity-for-statistics-on-query-expressions\/"},"modified":"2024-02-28T13:28:26","modified_gmt":"2024-02-28T21:28:26","slug":"conditional-selectivity-for-statistics-on-query-expressions","status":"publish","type":"msr-research-item","link":"https:\/\/www.microsoft.com\/en-us\/research\/publication\/conditional-selectivity-for-statistics-on-query-expressions\/","title":{"rendered":"Conditional Selectivity for Statistics on Query Expressions"},"content":{"rendered":"
Cardinality estimation during query optimization relies on simplifying assumptions that usually do not hold in practice. To diminish the impact of inaccurate estimates during optimization, statistics on query expressions (SITs) have been previously proposed. These statistics help directly model the distribution of tuples on query sub-plans. Past work in statistics on query expressions has exploited view matching technology to harness their benefits. In this paper we argue against such an approach as it overlooks significant opportunities for improvement in cardinality estimations. We then introduce a framework to reason with SITs based on the notion of conditional selectivity. We present a dynamic programming algorithm to efficiently find the most accurate selectivity estimation for given queries, and discuss how such an approach can be incorporated into existing optimizers with a small number of changes. Finally, we demonstrate experimentally that our technique results in superior cardinality estimations than previous approaches with very little overhead.<\/p>\n<\/div>\n
<\/p>\n","protected":false},"excerpt":{"rendered":"
Cardinality estimation during query optimization relies on simplifying assumptions that usually do not hold in practice. To diminish the impact of inaccurate estimates during optimization, statistics on query expressions (SITs) have been previously proposed. These statistics help directly model the distribution of tuples on query sub-plans. Past work in statistics on query expressions has exploited […]<\/p>\n","protected":false},"featured_media":0,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr-author-ordering":[{"type":"user_nicename","value":"Nico Bruno","user_id":"43155"},{"type":"user_nicename","value":"Surajit Chaudhuri","user_id":"33764"}],"msr_publishername":"Association for Computing Machinery, Inc.","msr_publisher_other":"","msr_booktitle":"","msr_chapter":"","msr_edition":"","msr_editors":"","msr_how_published":"","msr_isbn":"","msr_issue":"","msr_journal":"","msr_number":"","msr_organization":"","msr_pages_string":"","msr_page_range_start":"","msr_page_range_end":"","msr_series":"","msr_volume":"","msr_copyright":"Copyright \u00a9 2007 by the Association for Computing Machinery, Inc. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and\/or a fee. Request permissions from Publications Dept, ACM Inc., fax +1 (212) 869-0481, or permissions@acm.org. The definitive version of this paper can be found at ACM's Digital Library --http:\/\/www.acm.org\/dl\/.","msr_conference_name":"SIGMOD","msr_doi":"","msr_arxiv_id":"","msr_s2_paper_id":"","msr_mag_id":"","msr_pubmed_id":"","msr_other_authors":"","msr_other_contributors":"","msr_speaker":"","msr_award":"","msr_affiliation":"","msr_institution":"","msr_host":"","msr_version":"","msr_duration":"","msr_original_fields_of_study":"","msr_release_tracker_id":"","msr_s2_match_type":"","msr_citation_count_updated":"","msr_published_date":"2004-1-1","msr_highlight_text":"","msr_notes":"","msr_longbiography":"","msr_publicationurl":"","msr_external_url":"","msr_secondary_video_url":"","msr_conference_url":"","msr_journal_url":"","msr_s2_pdf_url":"","msr_year":0,"msr_citation_count":0,"msr_influential_citations":0,"msr_reference_count":0,"msr_s2_match_confidence":0,"msr_microsoftintellectualproperty":true,"msr_s2_open_access":false,"msr_s2_author_ids":[],"msr_pub_ids":[],"msr_hide_image_in_river":0,"footnotes":""},"msr-research-highlight":[],"research-area":[13563],"msr-publication-type":[193716],"msr-publisher":[],"msr-focus-area":[],"msr-locale":[268875],"msr-post-option":[],"msr-field-of-study":[],"msr-conference":[],"msr-journal":[],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-154450","msr-research-item","type-msr-research-item","status-publish","hentry","msr-research-area-data-platform-analytics","msr-locale-en_us"],"msr_publishername":"Association for Computing Machinery, Inc.","msr_edition":"","msr_affiliation":"","msr_published_date":"2004-1-1","msr_host":"","msr_duration":"","msr_version":"","msr_speaker":"","msr_other_contributors":"","msr_booktitle":"","msr_pages_string":"","msr_chapter":"","msr_isbn":"","msr_journal":"","msr_volume":"","msr_number":"","msr_editors":"","msr_series":"","msr_issue":"","msr_organization":"","msr_how_published":"","msr_notes":"","msr_highlight_text":"","msr_release_tracker_id":"","msr_original_fields_of_study":"","msr_download_urls":"","msr_external_url":"","msr_secondary_video_url":"","msr_longbiography":"","msr_microsoftintellectualproperty":1,"msr_main_download":"227956","msr_publicationurl":"","msr_doi":"","msr_publication_uploader":[{"type":"file","viewUrl":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2004\/01\/conditional.pdf","id":"227956","title":"conditional.pdf","label_id":"243109","label":0}],"msr_related_uploader":"","msr_citation_count":0,"msr_citation_count_updated":"","msr_s2_paper_id":"","msr_influential_citations":0,"msr_reference_count":0,"msr_arxiv_id":"","msr_s2_author_ids":[],"msr_s2_open_access":false,"msr_s2_pdf_url":null,"msr_attachments":[{"id":227956,"url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2004\/01\/conditional.pdf"}],"msr-author-ordering":[{"type":"user_nicename","value":"Nico Bruno","user_id":43155,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Nico Bruno"},{"type":"user_nicename","value":"Surajit Chaudhuri","user_id":33764,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Surajit Chaudhuri"}],"msr_impact_theme":[],"msr_research_lab":[],"msr_event":[],"msr_group":[957177],"msr_project":[967236,169456],"publication":[],"video":[],"msr-tool":[],"msr_publication_type":"inproceedings","related_content":{"projects":[{"ID":967236,"post_title":"Query Optimization for Database Systems","post_name":"query-optimization-for-database-systems","post_type":"msr-project","post_date":"2023-12-11 15:19:29","post_modified":"2023-12-11 15:19:32","post_status":"publish","permalink":"https:\/\/www.microsoft.com\/en-us\/research\/project\/query-optimization-for-database-systems\/","post_excerpt":"The query optimizer is a crucial component in a relational database system and is responsible for finding a good execution plan for a SQL query. For cloud database service providers, the importance of query optimization is amplified due to the scale (e.g., millions of databases hosted) and variety of different workloads for which the query optimizer is expected to work well \"out-of-the-box\". Query optimization is challenging due to the richness of SQL queries that contain…","_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/967236"}]}},{"ID":169456,"post_title":"AutoAdmin","post_name":"autoadmin","post_type":"msr-project","post_date":"2001-11-02 14:41:11","post_modified":"2019-02-05 12:04:17","post_status":"publish","permalink":"https:\/\/www.microsoft.com\/en-us\/research\/project\/autoadmin\/","post_excerpt":"Database management systems provide functionality that is central to developing business applications. Therefore, database management systems are increasingly being used as an important component in applications. Yet, the problem of tuning database management systems for achieving required performance is significant, and results in high total cost of ownership (TCO). The goal of our research in the AutoAdmin project is to make database systems self-tuning and self-administering. We achieve this by enabling databases to track the…","_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/169456"}]}}]},"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/154450","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-research-item"}],"version-history":[{"count":3,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/154450\/revisions"}],"predecessor-version":[{"id":1010547,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/154450\/revisions\/1010547"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=154450"}],"wp:term":[{"taxonomy":"msr-research-highlight","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-highlight?post=154450"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=154450"},{"taxonomy":"msr-publication-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-publication-type?post=154450"},{"taxonomy":"msr-publisher","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-publisher?post=154450"},{"taxonomy":"msr-focus-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-focus-area?post=154450"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=154450"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=154450"},{"taxonomy":"msr-field-of-study","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-field-of-study?post=154450"},{"taxonomy":"msr-conference","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-conference?post=154450"},{"taxonomy":"msr-journal","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-journal?post=154450"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=154450"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=154450"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}