Random<\/strong>: Queries with unpredictable patterns.<\/li>\n<\/ul>\n\n\n\nThese insights, illustrated in Figure 1, form the basis of SIBYL\u2019s ability to forecast query workloads, enabling databases to maintain peak efficiency even as usage patterns shift.<\/p>\n\n\n\nFigure 1. We studied the changing patterns and predictability of database queries by analyzing two weeks\u2019 worth of anonymized data from Microsoft\u2019s telemetry system, which guides decision-making for Microsoft products and services.<\/figcaption><\/figure>\n\n\n\nSIBYL uses machine learning to analyze historical data and parameters to predict queries and arrival times. SIBYL\u2019s architecture, illustrated in Figure 2, operates in three phases:<\/p>\n\n\n\n
\nTraining<\/strong>: It uses historical query logs and arrival times to build machine learning models.<\/li>\n\n\n\nForecasting<\/strong>: It employs pretrained models to predict future queries and their timing.<\/li>\n\n\n\nIncremental fine-tuning<\/strong>: It continuously adapts to new workload patterns through an efficient feedback loop.<\/li>\n<\/ul>\n\n\n\nFigure 2. An overview of SIBYL\u2019s architecture.<\/figcaption><\/figure>\n\n\n\nChallenges and innovations in designing a forecasting framework<\/h2>\n\n\n\n Designing an effective forecasting framework is challenging, particularly in managing the varying number of queries and the complexity of creating separate models for each type of query. SIBYL addresses these challenges by grouping high-volume queries and clustering low-volume ones, supporting scalability and efficiency. As demonstrated in Figure 3, SIBYL consistently outperforms other forecasting models, maintaining accuracy over different time intervals and proving its effectiveness in dynamic workloads.<\/p>\n\n\n\nFigure 3. SIBYL-LSTM\u2019s accuracy compared with other models in forecasting queries for the next time interval.<\/figcaption><\/figure>\n\n\n\nSIBYL adapts to changes in workload patterns by continuously learning, retaining high accuracy with minimal adjustments. As shown in Figure 4, the model reaches 95% accuracy after fine-tuning in just 6.4 seconds, nearly matching its initial accuracy of 95.4%.<\/p>\n\n\n\nFigure 4. Fine-tuning results on telemetry workload changes.<\/figcaption><\/figure>\n\n\n\nTo address slow dashboard performance, we tested SIBYL by using it to create materialized views, which are precomputed query results stored so that matching queries run faster. SIBYL\u2019s forecasts indicate which queries are likely to recur, guiding which views to store in advance and expediting future queries.<\/p>\n\n\n\n
We trained SIBYL on 2,237 queries drawn from 20 days of anonymized Microsoft sales data, enabling us to create materialized views for the following day. Creating views from historical data alone improved query performance by a factor of 1.06, while creating them from SIBYL\u2019s predictions achieved a factor of 1.83. This demonstrates that SIBYL\u2019s ability to forecast future workloads can significantly improve database performance.<\/p>\n\n\n\n
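As a rough illustration of how a forecast could guide view selection, the sketch below ranks forecast query templates by how often they are predicted to arrive and keeps the top few as materialized view candidates. This is not the selection method used with SIBYL. The function name, the template names, the arrival times, and the budget are all hypothetical and chosen only for the example.

```python
from collections import Counter


def pick_view_candidates(forecast, budget):
    """Return up to `budget` query templates, most frequently forecast first.

    `forecast` is a list of (template, predicted_arrival_hour) pairs.
    This is a hypothetical heuristic, not the approach described in the post.
    """
    # Count how many times each template appears in the forecast window.
    counts = Counter(template for template, _arrival in forecast)
    # most_common orders by count, highest first.
    return [template for template, _count in counts.most_common(budget)]


# Hypothetical forecast for the next day: (query template, arrival hour).
forecast = [
    ("sales_by_region", 9.0),
    ("sales_by_product", 9.5),
    ("sales_by_region", 10.0),
    ("top_customers", 11.0),
    ("sales_by_region", 13.0),
    ("sales_by_product", 14.0),
]

candidates = pick_view_candidates(forecast, budget=2)
print(candidates)  # ['sales_by_region', 'sales_by_product']
```

A real selector would also weigh the storage and refresh cost of each view against the expected savings. Frequency alone is used here only to keep the sketch short.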
Implications and looking ahead<\/h3>\n\n\n\n SIBYL\u2019s ability to predict dynamic workloads has numerous applications beyond improving materialized views. It can help organizations scale resources efficiently, reducing costs. It can also improve query performance by automatically organizing data so that the most frequently accessed data is readily available. Moving forward, we plan to integrate more machine learning techniques to make SIBYL more efficient and to reduce the effort needed for setup, with the goal of making databases faster and more reliable under dynamic workloads.<\/p>\n\n\n\n
Acknowledgments<\/h2>\n\n\n\n We would like to thank our paper co-authors for their valuable contributions and efforts: Jyoti Leeka, Alekh Jindal, and Jishen Zhao.<\/p>\nOpens in a new tab<\/span>","protected":false},"excerpt":{"rendered":"SIBYL is a machine learning model that makes highly accurate predictions of database queries, enabling tuning for more efficiency. Applying traditional database optimizations to these predicted queries helps maintain high performance as demands change.<\/p>\n","protected":false},"author":37583,"featured_media":1041537,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"footnotes":""},"categories":[1],"tags":[],"research-area":[13563],"msr-region":[],"msr-event-type":[],"msr-post-option":[243984],"msr-impact-theme":[],"msr-promo-type":[],"msr-podcast-series":[],"msr_event_details":{"start":"","end":"","location":""},"podcast_url":"","podcast_episode":"","msr_research_lab":[],"msr_impact_theme":[],"related-publications":[],"related-downloads":[],"related-videos":[],"related-academic-programs":[],"related-groups":[684024],"related-projects":[],"related-events":[],"related-researchers":[{"type":"user_nicename","value":"Rana Alotaibi","user_id":42168,"display_name":"Rana Alotaibi","author_link":"Rana Alotaibi<\/a>","is_active":false,"last_first":"Alotaibi, Rana","people_section":0,"alias":"ranaalotaibi"},{"type":"guest","value":"hanxian-huang","user_id":"1041612","display_name":"Hanxian Huang","author_link":" Hanxian Huang<\/a>","is_active":true,"last_first":"Huang, Hanxian","people_section":0,"alias":"hanxian-huang"},{"type":"user_nicename","value":"Tarique Siddiqui","user_id":39645,"display_name":"Tarique Siddiqui","author_link":" Tarique Siddiqui<\/a>","is_active":false,"last_first":"Siddiqui, 
Tarique","people_section":0,"alias":"tasidd"},{"type":"user_nicename","value":"Carlo Curino","user_id":31352,"display_name":"Carlo Curino","author_link":" Carlo Curino<\/a>","is_active":false,"last_first":"Curino, Carlo","people_section":0,"alias":"ccurino"},{"type":"user_nicename","value":"Jes\u00fas Camacho Rodr\u00edguez","user_id":40693,"display_name":"Jes\u00fas Camacho Rodr\u00edguez","author_link":" Jes\u00fas Camacho Rodr\u00edguez<\/a>","is_active":false,"last_first":"Camacho Rodr\u00edguez, Jes\u00fas","people_section":0,"alias":"jesusca"},{"type":"user_nicename","value":"Yuanyuan Tian","user_id":40708,"display_name":"Yuanyuan Tian","author_link":" Yuanyuan Tian<\/a>","is_active":false,"last_first":"Tian, Yuanyuan","people_section":0,"alias":"yuanyuantian"}],"msr_type":"Post","featured_image_thumbnail":" ","byline":"","formattedDate":"June 11, 2024","formattedExcerpt":"SIBYL is a machine learning model that makes highly accurate predictions of database queries, enabling tuning for more efficiency. 
Applying traditional database optimizations to these predicted queries helps maintain high performance as demands change.","_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/1041513"}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/users\/37583"}],"replies":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/comments?post=1041513"}],"version-history":[{"count":39,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/1041513\/revisions"}],"predecessor-version":[{"id":1048593,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/1041513\/revisions\/1048593"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media\/1041537"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=1041513"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/categories?post=1041513"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/tags?post=1041513"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=1041513"},{"taxonomy":"msr-region","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-region?post=1041513"},{"taxonomy":"msr-event-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-event-type?post=1041513"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/researc
h\/wp-json\/wp\/v2\/msr-post-option?post=1041513"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=1041513"},{"taxonomy":"msr-promo-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-promo-type?post=1041513"},{"taxonomy":"msr-podcast-series","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-podcast-series?post=1041513"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}