{"id":1553,"date":"2014-01-30T09:00:00","date_gmt":"2014-01-30T17:00:00","guid":{"rendered":"https:\/\/blogs.technet.microsoft.com\/dataplatforminsider\/2014\/01\/30\/in-memory-oltp-index-troubleshooting-part-ii\/"},"modified":"2024-01-22T22:49:03","modified_gmt":"2024-01-23T06:49:03","slug":"in-memory-oltp-index-troubleshooting-part-ii","status":"publish","type":"post","link":"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/2014\/01\/30\/in-memory-oltp-index-troubleshooting-part-ii\/","title":{"rendered":"In-Memory OLTP Index Troubleshooting, Part II"},"content":{"rendered":"
With the In-Memory OLTP feature in SQL Server 2014 we introduce two new types of indexes for memory-optimized tables: the memory-optimized NONCLUSTERED indexes, and the memory-optimized HASH indexes.<\/p>\n
Memory-optimized NONCLUSTERED indexes behave similar to traditional NONCLUSTERED indexes from an application point-of-view. The main difference being that memory-optimized indexes are always covering (i.e. all columns are virtually included), while with traditional disk-based NONCLUSTERED indexes you need to specify which column you want to include alongside the index key columns.<\/a><\/p>\n Memory-optimized HASH indexes behave different from NONCLUSTERED indexes. They are optimized for point-lookup operations, and do not support ordered scans or inequality seek operations. In addition, you need to specify a BUCKET_COUNT when creating the index, and you need to pick the right value in order for the index to perform optimally. In earlier posts we covered the following three index troubleshooting aspects:<\/p>\n All that said, it seems that HASH indexes have quite a few limitations, compared with NONCLUSTERED indexes. For that reason, it is usually a safer bet to start with NONCLUSTERED indexes, both for new applications and when migrating existing disk-based tables to memory-optimized. You can then\u00a0use HASH indexes to further optimize the workload. Indeed, HASH indexes are the most efficient index for equality search (point lookup) and full table scan operations.<\/p>\n In the remainder of this post we:<\/p>\n The DMV sys.dm_db_xtp_index_stats shows statistics for index operations performed by the in-memory storage engine. The index contains stats about the usage of the index since its creation in memory \u2013 note that memory-optimized indexes are always recreated on database restart. You can use the following query to retrieve key statistics about the usage of indexes on your table:<\/p>\n SELECT<\/span> ix.index_id, ix.name, scans_started, rows_returned<\/span><\/p>\n FROM<\/span> sys<\/span>.dm_db_xtp_index_stats ixs JOIN sys.indexes<\/span> ix ON<\/span> <\/span><\/p>\n ix.object_id<\/span>=ixs.object_id<\/span> AND ix.index_id=ixs.index_id<\/span><\/p>\n WHERE<\/span> ix.object_id<\/span>=object_id<\/span>(‘<table name><\/i>‘<\/span>)<\/span><\/p>\n For troubleshooting indexes, the columns \u2018scans_started\u2019 and \u2018rows_returned\u2019 contain key information:<\/p>\n If the number of rows_returned is significantly larger than the scans_started, this is an indication that, on average, index operations scan a large part of the index. If all index operations are expected to be point lookups, this could be an indication of one of the earlier-mentioned problems where the query\u00a0calls for an operation to be supported by the index, thus causing a revert to full index scan, such as: search requires a subset of hash index key columns<\/i> or search on inequality predicates with a hash index<\/i>.<\/p>\n The scans_started being larger than rows_returned is an indication that the workload is insert-heavy, or that a lot of point lookups failed to locate a row.<\/a><\/p>\n Issue: <\/b>Index keys with many duplicate values can cause performance problems. If each index key has 5 duplicates this is usually not a problem, but if the discrepancy between the number of unique index keys and the number of rows in the tables becomes very large \u2013 more than 10X \u2013 this can become problematic.<\/p>\n All rows with the same index key end up in the same duplicate chain. For hash indexes this can create a lot of overhead in case of hash collisions: if multiple index keys end up in the same bucket due to a hash collision, index scanners always need to scan the full duplicate chain for the first value before they can locate the first row corresponding to the second value. For nonclustered indexes this causes additional overhead for garbage collection.<\/p>\n Symptom: <\/b>For hash indexes the performance of DML operations degrades and CPU utilization increases. In addition, there is an increase in CPU utilization during database startup, and a potential increase in recovery time. This becomes especially clear when inserting a large number of rows.<\/p>\n\n
\n
Troubleshooting seek vs. scan using XTP index DMVs<\/h2>\n
\n
\n
Index keys with many duplicate values<\/h2>\n