Aggify: Lifting the Curse of Cursor Loops using Custom Aggregates
Loops that iterate over SQL query results are quite common, both in application programs that run outside the DBMS, as well as User Defined Functions (UDFs) and stored procedures that run within the DBMS. It can be argued that set-oriented operations are more efficient and should be preferred over iteration; but from real world use cases, it is clear that loops over query results are inevitable in many situations, and are preferred by many users. Such loops, known as cursor loops, come with huge trade-offs and overheads w.r.t. performance, resource consumption and concurrency.
We present Aggify, a technique for optimizing loops over query results that overcomes these overheads. It achieves this by automatically generating custom aggregates that are equivalent in semantics to the loop. Thereby, Aggify completely eliminates the loop by rewriting the query to use this generated aggregate. This technique has several advantages such as: (i) pipelining of entire cursor loop operations instead of materialization, (ii) pushing down loop computation from the application layer into the DBMS, closer to the data, (iii) leveraging existing work on optimization of aggregate functions, resulting in efficient query plans. We describe the technique underlying Aggify, and present our experimental evaluation over benchmarks as well as real workloads that demonstrate the significant benefits of this technique.
Harmonizing the declarative and imperative in database systems
Most relational database engines allow users to express their intent with both declarative SQL and imperative functions/procedures, and practitioners often combine the two in database applications. But while today’s database systems employ highly sophisticated techniques to optimize and evaluate declarative SQL statements, the evaluation of imperative programs has largely remained naive and inefficient. This has limited their use in many performance-critical situations despite imperative programming offering several benefits over SQL.
In this webinar, Karthik Ramachandra, a Principal Engineering Manager who heads the Azure SQL Database R & D India organization at Microsoft, will take you on a journey addressing this important but often overlooked problem. First, he’ll describe how the declarative and imperative styles of programming are intertwined in today’s database systems and explain the challenges faced by practitioners. Then, he’ll address why this problem has lingered without a solution for years and what makes it unique. He’ll also delve into the details of Froid and Aggify, two related techniques for evaluating imperative programs in database systems that result in performance improvements of up to multiple orders of magnitude over the existing state of the art.
Together, you’ll explore:
- The interplay of declarative and imperative styles of programming in database systems
- The root cause of the performance disparity between these two styles
- The recent developments with Froid and Aggify, which blend technologies from compilers and database query optimizers seamlessly to get the best of both programming styles
- An experimental evaluation that demonstrates how these techniques harmonize the two disparate programming styles for a large class of database applications
Resource list:
- Froid homepage (opens in new tab)
- Aggify homepage (opens in new tab)
- Optimizing imperative functions in relational databases with Froid (opens in new tab) (Blog)
- Introducing Scalar UDF Inlining (opens in new tab) (Blog)
- Froid and the relational database query quandary with Dr. Karthik Ramachandra (opens in new tab) (Podcast)
- Froid: Optimization of Imperative Programs in a Relational Database (opens in new tab) (Paper)
- BlackMagic: Automatic Inlining of Scalar UDFs into SQL Queries with Froid (opens in new tab) (Paper)
- Aggify: Lifting the Curse of Cursor Loops using Custom Aggregates (opens in new tab) (Paper)
- Invited talk at University of Washington, Seattle, March 2019 (opens in new tab)
*This on-demand webinar features a previously recorded Q&A session and open captioning.
Explore more Microsoft Research webinars: https://aka.ms/msrwebinars (opens in new tab)