Cardinality Estimation Feedback in SQL Server 2022
http://approjects.co.za/?big=en-us/sql-server/blog/2022/12/01/cardinality-estimation-feedback-in-sql-server-2022/
Thu, 01 Dec 2022

Part of the SQL Server 2022 blog series.

Cardinality estimation (CE) is a process used by the query optimizer to estimate the number of rows returned by a query or sub-part of a query. These estimates rely on two underlying things: statistics about the data and a set of assumptions—known as the model. The model determines how we interpret statistics and data coming from various sub expressions, and so, for any given set of statistics, different models will arrive at different cardinality estimates. 


Until SQL Server 2022, CE could only have one model. The set of assumptions used was baked into the code of the server, so whatever model was picked was used for all queries. However, we know that not all queries and workloads perform best under one single model. For some queries, the model we use works out well, but for others, a different model would perform better. With CE feedback in SQL Server 2022, we can now tailor the model used to generate a query plan to the specific query.

CE has always had three basic assumptions that comprise the model: independence (or partial independence), uniformity, and containment. These three assumptions determine how we interpret histograms, and they determine how we combine data during joins or in the presence of multiple predicates. In this blog, I will explain these model variants and what they mean in more detail.

Uniformity

Let’s begin by discussing the uniformity assumption. This assumption is used when interpreting data from the on-disk histograms—abstracted data about the columns being queried. We assume that all data within a step, or bucket, of a histogram is uniformly distributed at an average frequency for that bucket. When we query data, this lets us estimate the number of rows that satisfy the predicate.

Now, CE feedback modifies the uniformity assumption in only one special case—that of row goal queries. These queries look like TOP n, FAST n, or IN, and there is a special optimization for row goal queries that relies on the uniformity assumption. Whenever we believe that a particular value occurs at a high enough frequency (based on our interpretation of the histogram using the uniformity assumption), we choose to do a quick scan of a few pages, assuming that we will get enough qualifying rows very quickly. However, if the data is skewed, we may have falsely assumed more qualifying values than were actually present. This means we scan far more pages than expected to get the requisite number of rows.

CE Feedback can detect such scenarios and turn off the special row goal optimization. If it turns out that the query is indeed faster without this assumption, we keep this change by persisting it in the query store in the form of a query store hint, and the new optimization will be used for future executions of the query.
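To make this concrete, here is a minimal sketch; the table and the query_id are hypothetical. The hint shown is the documented USE HINT that disables the row goal optimization, which is the kind of adjustment CE feedback persists through the query store:

-- A TOP query like this is a row goal candidate; dbo.Orders is hypothetical.
SELECT TOP (100) * FROM dbo.Orders WHERE Status = N'Backordered';

-- The persisted adjustment is equivalent to setting a query store hint like
-- this (42 is an illustrative query_id from sys.query_store_query):
EXEC sys.sp_query_store_set_hints
    @query_id = 42,
    @query_hints = N'OPTION (USE HINT (''DISABLE_OPTIMIZER_ROWGOAL''))';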

[Figures: non-uniform versus uniform data distributions]

Independence

Consider a WHERE clause with two predicates combined with an AND—something like City = 'Seattle' AND State = 'WA'. Under the model assumption of independence, we take the selectivities of the individual predicates (City = 'Seattle', State = 'WA') and multiply those probabilities together. Under the model assumption of correlation, we take the most selective predicate (City = 'Seattle') and use the selectivity of that predicate alone to determine the selectivity of the conjunctive clause. There is a third model of partial correlation, in which we multiply the selectivity of the most selective predicate with a weakened selectivity (raised to a power less than 1, to make the selectivity closer to 1) of each successive predicate.
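To make the three models concrete, here is a back-of-the-envelope sketch; the table name, row count, and selectivities are invented for illustration:

-- Suppose a hypothetical dbo.Address table has 1,000,000 rows, and the
-- histograms give selectivity 0.001 for City = 'Seattle' and 0.01 for
-- State = 'WA'. Then the estimates under each model are roughly:
--   Full independence:   0.001 * 0.01             = 0.00001 -> ~10 rows
--   Partial correlation: 0.001 * POWER(0.01, 0.5) = 0.0001  -> ~100 rows
--   Full correlation:    0.001 (most selective predicate alone) -> ~1,000 rows
SELECT COUNT(*)
FROM dbo.Address
WHERE City = N'Seattle' AND [State] = N'WA';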

Figure 1: Independence. P1 and P2 are independent—that is, the truth (or falsehood) of P1 tells us nothing about the truth or falsehood of P2.
Figure 2: Partial correlation. In cases where we are evaluating P1 = T and P2 = T, you can see that P1 being true gives a higher likelihood of P2 being true. The cases where P1 is false are greyed out because they do not satisfy P1 = T.
Figure 3: Complete correlation. When evaluating P1 = T and P2 = T, we can see that any time P1 is true, P2 is also true (P1 implies P2).

CE always starts out with this last model of partial correlation (referred to elsewhere as exponential backoff). With CE feedback, we can detect when our estimates are too high, meaning the predicates are more independent than assumed, or too low, meaning there is more correlation than expected, and adjust the model used for that query and those predicates for future executions. If this makes the plan or performance better, we persist the adjustment as a query store hint and use it for future executions.
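For reference, these model variants map to documented USE HINT options that CE feedback can persist as query store hints; the query_id below is illustrative:

-- Estimates too high (predicates were actually more independent):
EXEC sys.sp_query_store_set_hints
    @query_id = 7,
    @query_hints = N'OPTION (USE HINT (''ASSUME_FULL_INDEPENDENCE_FOR_FILTER_ESTIMATES''))';

-- Estimates too low (more correlation than the default backoff assumed):
EXEC sys.sp_query_store_set_hints
    @query_id = 7,
    @query_hints = N'OPTION (USE HINT (''ASSUME_MIN_SELECTIVITY_FOR_FILTER_ESTIMATES''))';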

Containment

The model assumption of containment means that users query data that is actually in the table. That is, if there is a column = constant predicate in the query, we assume that the constant actually exists in the table, at the frequency appropriate for where it falls within the histogram. We also assume that there is a containment relationship between joins; basically, we assume that users wouldn’t join two tables together if they didn’t think there would be matches. However, there are two ways of looking at the containment relationship between joins: base containment and simple containment.

Base containment assumes that there is an inherent relationship between the tables participating in a join but doesn’t make assumptions about the filters occurring on top of those tables before the join occurs. A good example might be a table of store sales and a table of store returns. We would assume that all things returned were also sold, but we would not assume that any filtering on the sales or returns tables makes containment more or less likely—we just assume containment at the base table level and scale the size of the estimated result up or down based on the filters in play.

[Figures: base containment; store returns contained in store sales, independent of any filters]

Simple containment is a bit different—instead of assuming some inherent relationship between the base tables, it assumes that the filters applied to those tables create a containment relationship. For example, querying for graduating seniors from a specific high school and joining that with a query for athletes in a given zip code. While there is some inherent relationship between the two tables a priori, the filters applied specifically limit and create a containment relationship. 

[Figures: simple containment; a join between two filtered tables, where the filters create the containment relationship]

CE starts with the base containment model for all queries. However, if the estimates for the join are ‘off’ in some way (the incoming estimates are good, but the outgoing estimates are bad), we try the alternate containment model on the next execution of the query. If that model is better, we persist it with a query store hint and use it for future executions.
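As a sketch, switching a query to simple containment corresponds to the documented hint below (the query_id is illustrative), and hints persisted by CE feedback can be reviewed in the query store:

EXEC sys.sp_query_store_set_hints
    @query_id = 11,
    @query_hints = N'OPTION (USE HINT (''ASSUME_JOIN_PREDICATE_DEPENDS_ON_FILTERS''))';

-- Review persisted hints, including those applied by CE feedback:
SELECT query_hint_id, query_id, query_hint_text, source_desc
FROM sys.query_store_query_hints;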

Conclusion

In summary, CE requires a basic set of assumptions that are used to interpret and combine statistical data about histograms or sub-parts of a query. Those assumptions work well for some queries and less well for others. In SQL Server 2022, we have introduced CE feedback, which adjusts those assumptions on a per-query basis, based on actual performance of the query over time.

Learn more

Memory Grant Feedback: Persistence and Percentile Grant
http://approjects.co.za/?big=en-us/sql-server/blog/2022/10/27/memory-grant-feedback-persistence-and-percentile-grant/
Thu, 27 Oct 2022

Part of the SQL Server 2022 blog series.

Memory grant feedback (MGF) is an existing feature in SQL Server, with two critical improvements available in SQL Server 2022: feedback persistence and percentile grant feedback. These two improvements enhance the benefits of memory grant feedback as it already existed in SQL Server, allowing for less re-learning of appropriate grants and preventing fluctuating grant requirements from blocking the benefit of memory grant feedback.

Memory grant feedback

This existing feature aims to prevent costly spills and wasteful memory allocations by remembering the memory usage of previous executions of a query and adjusting the grant based on that data. This helps prevent spills caused by underestimates, and it increases the throughput of a workload by trimming overly large memory grants to a size that better fits the query.

Memory grant persistence

In prior versions of SQL Server, memory grant feedback data was stored only in the query plan cache. Thus, whenever the plan was evicted from cache, or in the case of a failover or server restart, the system would have to re-learn the memory grant feedback from scratch. In SQL Server 2022, the relevant information is also stored in the query store, so it can be retrieved at any time from disk, with no concern for performance degradation after cache eviction or server restart. Because query store is on by default in SQL Server 2022, this feature is an obvious win for improving the overall effectiveness of memory grant feedback.
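As a sketch of what persistence looks like in practice, persisted feedback can be inspected through the sys.query_store_plan_feedback catalog view in SQL Server 2022; the feature_desc filter below is how memory grant feedback rows are labeled:

SELECT plan_feedback_id, plan_id, feature_desc, state_desc, feedback_data
FROM sys.query_store_plan_feedback
WHERE feature_desc = N'Memory Grant Feedback';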

Percentile grant feedback

Prior to SQL Server 2022, memory grant feedback determined the current grant adjustment based only on the single most recent execution of the query. However, in some cases, especially those where there is a cached plan for a stored procedure in which different parameters cause vastly different result set sizes (and thus vastly different memory requirements), this can trigger a severe anti-pattern of alternating request sizes and always-wrong memory grant adjustments. This pattern is shown in the image below:

[Figure: alternating memory grants; each adjustment matches the previous execution and is always wrong for the current one]

In this example, the first query execution needs 800MB and is given 800MB. On the second execution, the query requires only 5MB but is given 800MB. MGF will realize that this was a massive over-grant and adjust the subsequent execution's grant to 5MB—but then we are back to the original parameter for which 800MB is needed! When MGF only looks at the single prior execution, this undesirable pattern can occur. Before SQL Server 2022, MGF would detect this scenario and disable itself, noticing that it was not helping to improve the workload.

SQL Server 2022 introduces percentile grant—a way of looking at more than just the single prior execution of a query. With percentile grant, we can look back to the history of executions and adjust memory grants based on a larger set of data points. We always err toward providing more memory to avoid spills, as spills are typically substantially more impactful to the customer than hits to throughput from an oversize grant. Thus, in this parameter-sensitive scenario, we might end up with a pattern of grants and executions that looks more like the following:

[Figure: percentile grant (blue) tracking toward the upper bound of the required grants (gray), while the prior single-execution algorithm (orange) stays out of phase]

In this example, you can see that the first execution required a very low memory grant, but the second execution required a much higher grant, which it was not given due to the first execution.  The third execution requires a smaller grant size, but percentile grant (blue line) gives it a grant higher than the last required grant, and the algorithm that we had previously (orange line) would grant only the amount needed in the prior execution.  Over time, the blue line fine-tunes itself to the upper limit of the memory grants required by the query, whereas the prior grant algorithm finds itself out of phase with the grant required. 
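Both behaviors are controlled by database scoped configurations in SQL Server 2022, and both default to ON; a minimal sketch, shown only to illustrate the knobs:

ALTER DATABASE SCOPED CONFIGURATION SET MEMORY_GRANT_FEEDBACK_PERSISTENCE = ON;
ALTER DATABASE SCOPED CONFIGURATION SET MEMORY_GRANT_FEEDBACK_PERCENTILE_GRANT = ON;
-- Persistence (and therefore the execution history percentile grant draws on)
-- relies on Query Store, which is on by default in SQL Server 2022.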

Taken together, these two improvements make memory grant feedback more reliable and more robust. By persisting the feedback, it becomes robust to failovers, restarts, and cache evictions. By using the new percentile grant algorithm, we are able to respond much more effectively to queries with widely vacillating grant requirements, allowing the feature to continue to benefit the customer even in pathologically bad scenarios. These two improvements to memory grant feedback are just a couple of the many intelligent query processing improvements in SQL Server 2022.

Learn More

Intelligent Query Processing: degree of parallelism feedback
http://approjects.co.za/?big=en-us/sql-server/blog/2022/10/20/intelligent-query-processing-degree-of-parallelism-feedback/
Thu, 20 Oct 2022

Part of the SQL Server 2022 blog series.

Current DOP methods are inefficient

The degree of parallelism (DOP) with which a query is executed can greatly impact its performance. Any time a query uses parallelism, there is always the question of whether it is using the right amount. If the degree of parallelism is too high, it can introduce inefficiencies into the query execution. If the degree of parallelism is too low, we may be missing out on some of the speed-up that parallelism can provide. Users can manually set a maximum degree of parallelism for a query or for a server using the MAXDOP setting or hint. However, it has been a constant challenge for users to manually determine and tweak the correct degree of parallelism for each query. At most, they set MAXDOP when they notice a problem; they don't typically try to determine the optimal degree of parallelism for each query in their workload.

Degree of parallelism feedback: feature overview

In SQL Server 2022, we introduced a new feature called DOP feedback. This feature looks at any parallel query and determines whether it might perform better with a lower degree of parallelism than currently being used. For example, perhaps 16 threads will perform better than 20 if there are a lot of waits on other threads. DOP feedback will test the new degree of parallelism and either decide that this was a good change and keep the 16 threads, or revert to the previous level of parallelism and go back to 20 threads. If the new degree of parallelism is good, the optimization is persisted inside the query store and will be applied to the query for future executions.

Figure 1: DOP feedback reduces the degree of parallelism in a stepwise fashion, incrementally decreasing the degree of parallelism and verifying at each step.

DOP feedback never increases the degree of parallelism; at best, it reverts to a stable previous DOP. It also works incrementally: instead of trying to drastically lower the degree of parallelism all at once, it tries a slightly lower degree of parallelism. If that is good, it might try another slightly lower degree of parallelism, and so on, potentially down to a degree of parallelism of two, although it will never make a parallel plan become serial. If the new, lower DOP is not as good, we go back to the previous known good DOP and keep the query at that level.

Example

A query is compiled with a degree of parallelism of 32. This means that the query will split off 32 different threads to execute the query. If DOP feedback detects a fair amount of wait times between threads and CPU overhead, it will suggest a lower DOP—say, 20. On the next execution, the query will execute with a DOP of 20. If the performance is better over the next several executions, the DOP of 20 will be considered stabilized. However, DOP feedback may then determine that there are still too many waits and further attempt a DOP of 16. Again, several executions are used to verify the feedback. Then, perhaps, a DOP of 8 is tried. If after several executions the DOP 8 performance is not better, then the system will return to suggesting a DOP of 16 as the most recent, stable, and verified DOP. 
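A minimal setup sketch: DOP feedback is enabled per database (it is off by default in SQL Server 2022 and relies on Query Store), and validated feedback can be observed in the query store; the feature_desc value below is how DOP feedback rows are labeled:

ALTER DATABASE SCOPED CONFIGURATION SET DOP_FEEDBACK = ON;

-- Watch feedback move through its states as it is tried and verified:
SELECT plan_id, feature_desc, state_desc, feedback_data
FROM sys.query_store_plan_feedback
WHERE feature_desc = N'DOP Feedback';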

Simple setup and easy optimization

With this feature enabled, all of this is done without triggering query recompiles, and without user action.

DOP feedback in SQL Server 2022 addresses a long-held challenge for our customers: finding the right degree of parallelism for each query without having to manually test and tweak each query for optimal performance. This is one of a handful of features in Intelligent Query Processing that provide customers with the best performance and a low-touch to no-touch experience. You can read more about other similar features in the SQL Server 2022 blog series.

Learn more

Intelligent Query Processing: feature family additions
http://approjects.co.za/?big=en-us/sql-server/blog/2022/09/15/intelligent-query-processing-feature-family-additions/
Thu, 15 Sep 2022

Part of the SQL Server 2022 blog series.

SQL Server 2022 introduces a handful of new members to the growing Intelligent Query Processing (IQP) family. These additions range from expanding and improving various query performance feedback mechanisms to adding two new features—parameter sensitive plans (PSPs) and optimized plan forcing. In this blog, we give a general overview of these additions, with detailed feature-specific blogs to follow.

Feedback features

SQL Server 2022 introduces two new query plan feedback features: degree of parallelism feedback and cardinality estimation feedback, and it improves the overall experience of the existing feedback feature: memory grant feedback.

Memory grant feedback

Memory grant feedback is a feature that was introduced in SQL Server 2017. It looks at executions of a query and determines if the query is using more memory than it was granted (spilling to disk), or if it is using far less than it was granted (possibly limiting throughput potential). As a query is executed multiple times, the system learns the amount of memory typically used by the query and adjusts the grant according to the needs of previous executions. In this release, we have improved memory grant feedback in two significant ways:

  1. Plan feedback is now persisted on disk. This means feedback is retained across cache evictions and server restarts.
  2. We have improved the algorithm that generates feedback to look at more of the query history before advising. This prevents issues that previously occurred if a plan had widely vacillating memory needs (as might be seen with a parameter sensitive plan).

As a result, users can benefit from memory grant feedback in a wider range of scenarios—and without requiring any re-learning of feedback after a restart, failover, or plan cache eviction.

Degree of parallelism feedback

SQL Server has long allowed users to specify a maximum degree of parallelism (DOP) for a server or for individual queries. However, fine-tuning the most performant DOP for every query in a workload can be time-consuming and arduous, and, in some cases, has not been easily possible without underlying code change to the workload. In SQL Server 2022, DOP feedback can detect opportunities to reduce the DOP for a query. DOP feedback reduces the DOP in steps until the query is running with an amount of parallelism that optimizes CPU usage and query performance.

Many customers do not have the time or expertise to individually tailor the degree of parallelism for each individual query. DOP Feedback does this automatically—reducing CPU and overall query time while still taking advantage of the maximum DOP that works for each individual query.

Cardinality estimation feedback

Cardinality estimation (CE) is a process used by SQL Server to determine how many rows might be returned by a specific part of a query plan. This process incorporates a basic set of assumptions—called the model—to produce the estimates. However, the basic set of assumptions that works best for one query might not work as well for a different query. In previous versions of SQL Server, customers were stuck with one baseline CE model—meaning that there were always some queries that did well and some that did less well but would have performed better with a different model. There have been ways to manually adjust the model, but they required code changes as well as in-depth technical investigation. In SQL Server 2022, however, we can now fine-tune the CE model in a way that is specific to an individual query. This means that if you have a workload with some queries performing best under one set of model assumptions, and others performing best with a different set of assumptions, all queries can be automatically adjusted by CE feedback to provide the best model for each query in the workload. This is done in a do-no-harm, automated way.

Users no longer have to limit their workload to a single CE model. Users no longer have to manually tweak the CE model for poorly performing queries. CE Feedback will automatically choose the best CE model available for the query, on a per-query basis, and without any manual intervention.

Parameter sensitive plan optimization

SQL Server uses parameters or parameter markers in Transact-SQL statements to increase its ability to match new Transact-SQL statements with existing, previously compiled execution plans and promote plan reuse. This process is also known as “parameter sniffing”: SQL Server “sniffs” the current parameter values during query compilation and passes those values to the Query Optimizer so they can be used to generate more efficient query plans. In many scenarios, this process works well, but sometimes it can go awry. Parameter sniffing (also known as parameter sensitivity) problems occur when a query plan that is efficient for one set of actual parameter values is inefficient for a different set. For example:

  1. You execute a query with one or more parameterized predicates.  
  2. During query compilation, the query optimizer generates a query execution plan based on compile-time parameter values. This plan is then cached and used for subsequent executions.
  3. You or someone else execute the same parameterized query but this time, using different runtime parameter values. The selectivity of those predicates for those runtime parameter values can be quite different from the estimated selectivity based on the compile time parameter values from when that plan was previously compiled. The compiled plan may be non-optimal for some parameter values, causing performance to suffer.

This often occurs when the data distribution within a table or tables is uneven (also known as data skew). There can be many other contributing factors to parameter sniffing problems, such as variable values that are unknown to the optimizer at compile time, code branching, and so on.

At a high level, SQL Server assumes that at any given point in time there is one optimal plan for a query. That plan may change over time if, for example, enough data changes in a table, or column statistics get updated and change drastically, or someone simply recompiles the query. However, the assumption remains the same: the plan cache will map each query to one plan. The parameter sensitive plan optimization (PSP) feature begins to unlock SQL Server’s ability to cache two or more plans for a query at the same time, with each plan being optimal for a subset of the parameter space. PSP optimization uses the histogram on column-level statistics to identify non-uniform distributions of data. This is really where the power lies, because what we are optimizing for is a plan that aligns with defined predicate cardinality ranges, or “buckets.” This alignment makes it possible to create different, optimal query plans based on the data distribution. For example, suppose a fictitious real estate agency has a table called PropertySearchByAgent, used to keep track of real estate agents and the number of listings each agent has. If we wanted to view all of the listings for AgentId 1, we could use a parameterized query such as:

EXEC sp_executesql N'SELECT * FROM PropertySearchByAgent WHERE AgentId = @AgentId', N'@AgentId int', 1

PSP optimization would determine, based on the predicate for this query (WHERE AgentId equal to some value) along with the runtime cardinality of the predicate, that it will “bucketize” and create separate queries (known as query variants) to execute and cache in the plan cache. By using different query variants that map back to the original parameterized query, PSP optimization can create multiple plans for a given query. If we wanted to query the PropertySearchByAgent table for information about AgentId 3, the same parameterized query that was used previously can be used, but this time with a different parameter value of 3. Since Agent 3 has many more listings than Agent 1, PSP optimization would generate a different plan that would be more optimal for the number of listings that Agent 3 has.
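As a sketch of how to see this in action: the feature is controlled by a database scoped configuration (on by default under compatibility level 160), and SQL Server 2022 exposes the variant-to-parent mapping in a Query Store catalog view:

ALTER DATABASE SCOPED CONFIGURATION SET PARAMETER_SENSITIVE_PLAN_OPTIMIZATION = ON;

-- Each row maps a query variant back to its parent parameterized query:
SELECT query_variant_query_id, parent_query_id, dispatcher_plan_id
FROM sys.query_store_query_variant;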

As a result, one of the most common performance challenges in SQL Server is one step closer to being resolved completely. This is a foundational capability that will provide great benefits for mission-critical workloads and will continue to improve over time.

Optimized plan forcing

Query optimization and compilation is a multi-phased process of quickly generating a “good enough” query execution plan. Query execution time includes the time it takes to compile the query. This can sometimes represent a substantial portion of the overall query execution time and consume significant system resources (such as CPU and memory). To reduce compilation overhead for repeating queries, SQL Server caches query plans for re-use. However, plans can be evicted from the cache due to memory pressure or SQL Server restarts, which can lead to subsequent calls of the previously cached queries requiring a new, full compilation cycle. Optimized plan forcing has been introduced to reduce the compilation time for complex queries, as well as to reduce CPU and memory overhead, by reducing the amount of work the optimization process has to go through.

  • What is optimized plan forcing? Optimized plan forcing (or optimization replay) is a new addition to the IQP family whose goal is to speed up the query optimization process for repeating forced queries.
  • Why would anyone use it? In certain scenarios, the database can use up a lot of resources to deal with the incoming compilation requests (for example, after a failover). To decrease the strain on the database server, optimization replay can be an effective tool.
  • How does optimization replay help? Optimization replay records the history of a compilation process when a query is executed so that it can be used in the future to speed up the optimization process. It achieves this by recording an “optimization replay script” (ORS) which is a compact representation of useful rules that were applied during the optimization process. Useful rules are those that contribute to generating some expression(s) that becomes a part of the final plan.

Optimization replay (OR) consists of two phases—capture and replay. The capture phase records a compact history of rules that, based on heuristics, are determined to be useful during the optimization process, and stores them inside the Showplan XML attribute in the sys.query_store_plan table within Query Store. We refer to this recording as the optimization replay script (ORS). After the ORS is captured, you typically must mark the query plan as “forced.” Plans can also become automatically forced if the automatic plan correction feature is enabled on the database. Once the plan is marked as “forced” and there is an ORS for the plan and the plan is not already in the plan cache, we kick off the “replay” phase. The replay phase occurs during subsequent executions of the captured query. If the optimization replay requirements are satisfied, the optimization of the query will utilize the ORS to reach the final plan more quickly by skipping the optimization rules that are not part of the ORS.
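A minimal sketch of the moving parts (the query_id and plan_id are illustrative): optimized plan forcing is controlled by a database scoped configuration, and replay kicks in for plans forced through the query store:

-- On by default in SQL Server 2022:
ALTER DATABASE SCOPED CONFIGURATION SET OPTIMIZED_PLAN_FORCING = ON;

-- Force a plan; if an ORS was captured with it, subsequent compilations
-- replay the ORS to reach the final plan more quickly:
EXEC sp_query_store_force_plan @query_id = 3, @plan_id = 3;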

Conclusion 

In conclusion, these five new additions and improvements to the IQP family continue to solve some of the most common pain points customers may experience with performance. The feedback family of features tailors query performance on a per-query basis in a do-no-harm way designed to avoid performance regressions. PSP optimization solves a problem that has long given customers issues, and optimized plan forcing speeds up the compilation step when users force plans. Each of these features taken individually improves some pain point or challenge—but the best part is that all of these features can be used together seamlessly, improving a single query in more than one dimension. We are proud to release these new additions to Intelligent Query Processing in SQL Server 2022.

Learn More
