Redundant indexes in SQL Server are a phenomenon that is more common than I'd like to admit. I see it in quite a lot of shops. Meaning, that there's still a significant target audience for this blog post! 🤨
This article published by Brent Ozar is very informative about redundant/duplicate indexes, what they mean, why they're bad, and what should be done with them.
Also, a few years ago, Guy Glantser published a post about dropping redundant indexes. It's very useful for finding all redundant indexes within all tables in a specific database.
But what both of these articles are missing - is the ability to easily generate Drop/Disable commands for these redundant indexes.
Additionally, what if there are "similar" indexes that are only "partially" redundant, and therefore it's not enough to simply drop one of them? Otherwise, some queries may suffer a negative performance impact.
Is there a way to take care of all of these issues?
Every single one? Everywhere? All at once?
In this blog post, I am going to show you how to detect, get details on, and be able to drop EVERY redundant index, in EVERY shape and size:
All tables
All tables with a minimum number of rows
Within a specific database
In all accessible databases
IMPORTANT NOTE:
It is important to note that you should only drop indexes that are truly redundant. Before dropping an index, you should ensure that it is not explicitly used by any queries or stored procedures in your database due to a table or query hint.
Let's talk about use cases
There are a few different use cases where there would be redundant indexes.
These use cases would differ from each other based on differences or similarities of a few index properties:
Key columns
Include columns
Filter definition
Other properties (data compression, index locking options, fill factor, filegroup)
For the purpose of this article, I will assume the following:
If two indexes have different filter definitions - they are NOT duplicates or redundant to each other in any way.
Clustered indexes are never considered redundant, even if their key columns are "contained" inside another index.
Primary keys and unique keys are never considered redundant, even if their key columns are "contained" inside another index.
Some miscellaneous index properties are ignored as part of this check (data compression, index locking options, fill factor, filegroup)
With that out of the way, let's dive in!
Fully Duplicate Index
A fully duplicate index would be one where all key columns and all include columns are identical.
Example:
Let's say we have a table named Products with the following columns:
ProductID (int, Primary Key)
ProductName (nvarchar(50))
CategoryID (int)
Price (money)
And let's say we have created two indexes on this table:
Index 1:
CREATE NONCLUSTERED INDEX IX_Products_CategoryID
ON Products(CategoryID)
INCLUDE(ProductName, Price);
Index 2:
CREATE NONCLUSTERED INDEX IX_Products_CategoryID_2
ON Products(CategoryID)
INCLUDE(ProductName, Price);
Both indexes have the same key column CategoryID and the same include columns ProductName and Price. Therefore, Index 2 is a fully duplicate index of Index 1.
Remediation:
In this case, we can drop Index 2 since it is a duplicate of Index 1 and provides no additional benefit. Dropping the redundant index can help to improve query performance and reduce storage space usage.
Fully Covered Redundant Index
A fully covered index would be one where all key columns of one index are also the first key columns of another index. The other (containing) index may potentially have additional key columns, and/or may have more include columns in addition to those in the redundant index.
The important thing is that the containing index has everything the redundant index has, and possibly more.
Example:
Let's say we have a table named Orders with the following columns:
OrderID (int, Primary Key)
OrderDate (datetime)
CustomerID (int)
EmployeeID (int)
ProductID (int)
Quantity (int)
TotalPrice (money)
And let's say we have created the following two indexes on this table:
Index 1:
CREATE NONCLUSTERED INDEX IX_Orders_CustomerID
ON Orders(CustomerID)
INCLUDE(OrderDate, TotalPrice);
Index 2:
CREATE NONCLUSTERED INDEX IX_Orders_CustomerID_EmployeeID
ON Orders(CustomerID, EmployeeID)
INCLUDE(OrderDate, ProductID, Quantity, TotalPrice);
In this case, Index 2 includes all the columns in Index 1 and has an additional key column EmployeeID. Also, Index 2 has additional include columns ProductID and Quantity.
Therefore, Index 2 is a more comprehensive index than Index 1 and covers all the queries that Index 1 covers, plus additional queries. Index 1 is a redundant index in this scenario since it provides no additional benefit.
Remediation:
We can drop Index 1 without affecting the performance of any queries.
Partially Redundant Index
A partially redundant index is one where its key columns are also the first key columns of another index. But the include columns may be entirely different. This means that the containing index is capable of the same "seek" operation that the redundant index provides, but may need to have a few columns added to its include columns list to fully provide the same functionality. Otherwise, the index will either not be used, or an additional key lookup operator will be added.
Example:
Let's say we have a table named Customers with the following columns:
CustomerID (int, Primary Key)
FirstName (nvarchar(50))
LastName (nvarchar(50))
City (nvarchar(50))
State (nvarchar(50))
ZipCode (nvarchar(10))
Email (nvarchar(100))
Phone (nvarchar(20))
And let's say we have created the following two indexes on this table:
Index 1:
CREATE NONCLUSTERED INDEX IX_Customers_City_State_ZipCode
ON Customers(City, State, ZipCode)
INCLUDE(FirstName, LastName);
Index 2:
CREATE NONCLUSTERED INDEX IX_Customers_City
ON Customers(City)
INCLUDE(FirstName, LastName, Email, Phone);
In this case, Index 2 has the key column City, which is also the first key column in Index 1. However, Index 2 has additional include columns Email and Phone, which are not included in Index 1.
Although Index 2 covers some additional queries that Index 1 does not, it is partially redundant because the key column City is already covered by Index 1. This means that Index 1 is already capable of performing the same "seek" operation that Index 2 provides for queries that only filter on the City column.
Remediation:
To fully provide the same functionality as Index 2, we can add the additional include columns Email and Phone to Index 1. This will eliminate the need for the redundant Index 2 and ensure that the optimal index is used for queries that filter on the City column.
The updated syntax for Index 1 would be:
CREATE NONCLUSTERED INDEX IX_Customers_City_State_ZipCode
ON Customers(City, State, ZipCode)
INCLUDE(FirstName, LastName, Email, Phone);
By dropping the redundant index and updating the containing index with additional include columns, we can reduce storage space usage and make the database more efficient.
WARNING:
Sometimes, when trying to create an "all-covering" index to replace all of its redundant/partially-redundant indexes, you risk creating "wide" indexes that have very long key or include column lists.
While it would still be "beneficial" in terms of storage space usage compared to leaving the redundant indexes as-is, there's still a possible performance risk factor here. This is because the more columns there are in an index, the more data would need to be scanned, read, and parsed during index seek or scan operations.
So, be careful so that you won't end up with huge column lists in your indexes, otherwise, you may end up harming the performance of some queries.
Sounds like a lot of work
You're right, it is a lot of work. But luckily, it's possible to "automate" most of it by writing a TSQL script!
Here, use this script from our Madeira Toolbox:
By default, this script outputs two different resultsets.
The first one is a "detailed" resultset showing all redundant indexes and the indexes that "contain" them. It has a lot of useful details, but could possibly contain duplicate information (if the same index is "contained" in more than one index).
The second resultset is a "summary" resultset showing only the redundant indexes, and each redundant index is shown only once. There is no duplicate information here, but some details are missing (which you can find in the first resultset). You can use this second resultset as your single source of truth for which redundant indexes should be dropped/disabled.
Take note of the columns "*_index_seeks", "*_index_scans", and "*_index_updates" to get an idea about the "popularity" of the redundant indexes compared to their containing counterparts.
Also, take note of the "redundant_index_pages" column as an indication of the current size of the index. Divide this number by 128 to get an equivalent size in MB (a division of the number of pages by 128 is the equivalent of multiplying by 8 to get the value in KB and then dividing by 1024 to get the value in MB). NOTE: In the second (summary) resultset, redundant_index_mb is already calculated for you.
The "DisableCmd" column can be used to quickly get the corresponding ALTER INDEX .. DISABLE commands to disable the redundant indexes. You can also use the "DisableIfActiveCmd" column for an "idempotent" alternative (only disables the index if it exists and is enabled).
The "DropCmd" column can be used to quickly get the corresponding DROP INDEX commands to drop the redundant indexes. It is also idempotent (only drops the index if it exists).
Optionally, changing the variable at the top @CompareIncludeColumnsToo to 0, will somewhat change the behavior of the script:
Instead of comparing the redundancy of indexes based on both key columns and include columns, the script will only compare the key columns. This will likely return even more "redundant" indexes, but you must beware as they are not necessarily "fully" redundant.
An additional 3rd resultset will be returned in the output precisely for this purpose.
It contains a column called "ExpandIndexCommand" which would have CREATE INDEX commands for "Fully Covering Indexes" that should be created to accommodate all redundant indexes based on the key columns. In other words, it takes the index with the most key columns, and adds to it all the include columns from all of its redundant counterparts.
This 3rd resultset also contains two additional columns "DisableRedundantIndexes" and "DropRedundantIndexes" that can be used to easily get rid of all the redundant indexes that would be replaced by the new covering index.
IMPORTANT NOTE:
Before dropping or disabling an index, you must make sure that it is not referenced as part of table or query HINTS. Otherwise, once the index is not available, such queries will start failing.
For this purpose, you can use the script available below:
Also, regardless of whether the index is used because of an explicit hint, or simply because the SQL optimizer has chosen it, it would still be useful to know what are the queries that currently make use of a certain index before you decide to drop it.
Use the query below to find usages of an index within the SQL plan cache:
DECLARE
@IndexName sysname = 'UQ_CountryCodes'
,@TableName sysname = 'CountryCodes'
DECLARE
@IndexNameWithBrackets sysname = QUOTENAME(@IndexName)
,@TableNameWithBrackets sysname = QUOTENAME(@TableName)
;WITH XMLNAMESPACES (DEFAULT N'http://schemas.microsoft.com/sqlserver/2004/07/showplan', N'http://schemas.microsoft.com/sqlserver/2004/07/showplan' AS p)
SELECT
cp.objtype,
cp.usecounts,
st.text,
qp.query_plan
FROM sys.dm_exec_cached_plans cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS qp
WHERE st.text LIKE '%' + @TableName + '%'
AND qp.query_plan.exist('//Object[@Index=sql:variable("@IndexNameWithBrackets") and @Table=sql:variable("@TableNameWithBrackets")]') = 1
Similarly, the query below can be used to find usages of an index within the Query Store:
DECLARE
@IndexName sysname = 'UQ_CountryCodes'
,@TableName sysname = 'CountryCodes'
DECLARE
@IndexNameWithBrackets sysname = QUOTENAME(@IndexName)
,@TableNameWithBrackets sysname = QUOTENAME(@TableName)
;WITH XMLNAMESPACES (DEFAULT N'http://schemas.microsoft.com/sqlserver/2004/07/showplan', N'http://schemas.microsoft.com/sqlserver/2004/07/showplan' AS p)
SELECT
q.query_id,
q.query_text_id,
qt.query_sql_text,
qp.query_plan_xml
FROM sys.query_store_query AS q
INNER JOIN sys.query_store_plan AS p ON q.query_id = p.query_id
INNER JOIN sys.query_store_query_text AS qt ON q.query_text_id = qt.query_text_id
OUTER APPLY (SELECT CAST(p.query_plan AS XML) AS query_plan_xml) AS qp
WHERE qt.query_sql_text LIKE '%' + @TableName + '%'
AND qp.query_plan_xml.exist('//Object[@Index=sql:variable("@IndexNameWithBrackets") and @Table=sql:variable("@TableNameWithBrackets")]') = 1
Note that when referenced inside the query plan XML, the index and table names must be encapsulated with square brackets.
Extra Caution
Disabling indexes before dropping them can be a useful way of removing the performance impact of the index (as if it no longer exists), while still retaining its definition so that it could be easily re-enabled. Many organizations would prefer this "safer" route just in case.
To re-enable a disabled index, you'll simply need to REBUILD it.
After disabling the indexes, when you're ready to take the next action, you can use the script below to easily generate either DROP or REBUILD commands for all disabled indexes:
Conclusion
Dropping redundant indexes in SQL Server can greatly improve the performance and efficiency of your database. Redundant indexes can take up valuable storage space, slow down your queries and data modification commands, and increase CPU usage. By identifying and dropping redundant indexes, you can free up space, reduce query overhead, and ensure that the optimal index is used for each query. Keep in mind that not all redundant indexes are obvious, so it's important to regularly review your index usage and remove any that are no longer needed. By taking these steps, you can ensure that your SQL Server database is running at peak performance and efficiency.
Hopefully, this blog post has provided you with the information and tools you need to make this easier for you.
Thanks for sharing this script, I have my version that I use for years but this one seems very good and I will sure check it out.
One important point to keep in mind before dropping any index that we should all remember to verify it is not used as an index hint... (SELECT * FROM sys.sql_modules WHERE definition LIKE '%%')