Resolving and Preventing Deadlocks in SQL Server
By Eitan Blumin

Deadlocks in SQL Server can be frustrating and can cause significant performance and reliability issues. A deadlock occurs when two or more transactions each wait for the other to release a lock on a resource, so that none of them can proceed; eventually, SQL Server automatically picks one of them as the "deadlock victim", kills it, and rolls it back. This typically happens when transactions access the same data in a different order, or when one transaction holds a lock on a resource while waiting for a lock held by another transaction. In this blog post, we'll discuss how to troubleshoot and prevent deadlocks in SQL Server.

Deadlocks can be a significant performance and reliability issue in SQL Server. To troubleshoot and prevent deadlocks, you need to monitor and identify them, optimize queries and database design, and change the "level" at which the resources are locked and their duration. By following these best practices, you can reduce the risk of deadlocks and improve the performance and reliability of your SQL Server.


1. Monitor and Identify Deadlocks

The first step in troubleshooting deadlocks is to monitor the SQL Server for any occurrences.

SQL Server provides a trace flag (1222) that can be enabled to log detailed information about deadlocks in the SQL Server Error Log. This is not recommended, though, because when you have frequent deadlocks it bloats the Error Log, and that makes navigating and monitoring the Error Log far more cumbersome.
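For reference only, this is what enabling and disabling the trace flag globally looks like (it can also be set as a startup parameter with -T1222):

-- Enables trace flag 1222 globally (deadlock details are written to the Error Log).
-- Use sparingly; as noted above, it can bloat the Error Log on busy systems.
DBCC TRACEON (1222, -1);

-- To turn it back off:
DBCC TRACEOFF (1222, -1);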

Luckily, SQL Server by default logs all deadlocks in the system_health extended event session.

Deadlocks are also captured by the Default Trace (based on the SQL Profiler tracing technology).

Both of the above can be accessed using a SQL script.
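For example, here is a minimal sketch of a query that pulls recent deadlock graphs from the system_health session's ring buffer target (the XML paths follow the standard xml_deadlock_report event; the Madeira Toolbox scripts referenced below are more complete and robust):

-- A minimal sketch: extract recent deadlock graphs from the system_health
-- Extended Events session (ring buffer target).
;WITH xe AS
(
    SELECT CAST(t.target_data AS XML) AS target_xml
    FROM sys.dm_xe_session_targets AS t
    INNER JOIN sys.dm_xe_sessions AS s
        ON s.address = t.event_session_address
    WHERE s.name = N'system_health'
      AND t.target_name = N'ring_buffer'
)
SELECT
    ev.event_node.value('(@timestamp)[1]', 'datetime2(3)') AS event_time_utc,
    ev.event_node.query('(data[@name="xml_report"]/value/deadlock)[1]') AS deadlock_graph
FROM xe
CROSS APPLY xe.target_xml.nodes('RingBufferTarget/event[@name="xml_deadlock_report"]') AS ev(event_node)
ORDER BY event_time_utc DESC;

The deadlock_graph XML returned by such a query can be saved as an .xdl file and opened in SSMS for a graphical view.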

Check out the sample scripts from the Madeira Toolbox on GitHub for this purpose.

Due to their robustness, it's preferable to use SQL Trace or Extended Events to capture deadlock events. You can even create your own trace or event session if you need additional filtering and custom retention settings beyond what's configured in the Default Trace and system_health.

Once you've identified a deadlock, you need to understand the cause of the problem. Deadlock reports ("deadlock graphs") are essentially XML documents, but it's possible to view them in a graphical representation using tools such as SSMS and SQL Sentry Plan Explorer. These can be used to identify which queries and resources are involved in the deadlock.

2. Make Queries Run Faster

Once you have identified the queries involved in the deadlock, you need to optimize them to reduce the probability of future deadlocks.

The faster a query runs, the less likely it is to be involved in a deadlock.

Some best practices for query optimization include:

  • Create covering indexes to reduce the possibility of a deadlock. Your aim should be to reduce table/index scans and key/RID lookup operations. You need enough indexes to make your queries fast and access as few objects as possible, but be careful not to create so many indexes that they slow things down by making DML commands hold more locks for longer durations (see the sketch after this list).

  • Retrieve only the minimal data required (e.g. remove from the SELECT clause any columns and objects that you don't actually need).

  • Rewrite the query to reduce unnecessary data access (e.g. avoid JOINs with tables if you don't actually need data from them, use EXISTS/NOT EXISTS where possible instead of IN/NOT IN, use @@ROWCOUNT after a DML operation instead of counting the rows again using COUNT, and so on).

  • Avoid using long transactions involving multiple statements.

  • Avoid using triggers and cascading updates that can cause locks on multiple objects and thus increase the risk of deadlocks.

  • Create indexes to match the foreign key columns. This way, you can eliminate deadlocks due to cascading deletes/updates.

  • Tune your transactional code so that queries work through tables in the same predictable order each time (to avoid cases where multiple transactions access objects in a different order).
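To make a few of these points more concrete, here is a short sketch using hypothetical Orders/Customers tables (all table, column, and index names below are illustrative only):

-- Hypothetical schema: dbo.Orders (OrderID, CustomerID, OrderDate, Status, Total)
-- and dbo.Customers (CustomerID, IsActive).

-- A covering index that supports filtering on Status and returns OrderDate/Total
-- without key lookups:
CREATE NONCLUSTERED INDEX IX_Orders_Status
    ON dbo.Orders (Status)
    INCLUDE (OrderDate, Total);

-- An index matching the foreign key column, to help avoid deadlocks caused by
-- cascading deletes/updates and foreign key validation scans:
CREATE NONCLUSTERED INDEX IX_Orders_CustomerID
    ON dbo.Orders (CustomerID);

-- Rewrite IN (...) as EXISTS and select only the columns you actually need:
SELECT o.OrderID, o.OrderDate, o.Total
FROM dbo.Orders AS o
WHERE o.Status = N'Pending'
AND EXISTS
(
    SELECT NULL
    FROM dbo.Customers AS c
    WHERE c.CustomerID = o.CustomerID
      AND c.IsActive = 1
);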

3. Change the Scope of the Lock

Deadlocks can occur at different "scopes" within SQL Server, depending on the resources accessed by the transactions. Here are some common scope types:

  1. Key

  2. Row

  3. Page

  4. Index Range

  5. Index

  6. HOBT/Partition

  7. Table / Object

  8. Database

Changing the scope at which locks are acquired is an effective strategy for preventing deadlocks in SQL Server. There are two approaches to changing lock scopes: reducing the lock scope or expanding the lock scope.

Reduction of the lock scope:

Reduction of the lock scope illustrated (individual resources locked instead of a group of resources)

Reducing the lock scope means locking smaller, more granular resources at a time, so that multiple sessions don't lock the same resource entirely. This can be achieved by using more granular locking hints, such as locking a page instead of a whole index, a small subset of rows instead of a whole table, or a single row instead of multiple rows.

This approach reduces the likelihood of deadlocks by allowing more concurrent transactions to access different resources without conflicts. The risk here is that we may lose transactional consistency because not all of the operation is encapsulated in a single transaction. So, if something fails, we may end up with partial and inconsistent data.
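For example, a large DELETE can be broken into small batches so that each iteration locks (and holds) far fewer rows at a time. This is only a sketch on a hypothetical dbo.ArchiveLog table, and it illustrates the trade-off described above, since each batch is its own transaction:

-- Sketch: delete in small batches instead of holding one huge lock scope.
-- dbo.ArchiveLog and its columns are hypothetical.
DECLARE @BatchSize int = 5000;

WHILE 1 = 1
BEGIN
    -- ROWLOCK asks for row-level locks instead of coarser page locks:
    DELETE TOP (@BatchSize)
    FROM dbo.ArchiveLog WITH (ROWLOCK)
    WHERE LogDate < DATEADD(YEAR, -1, GETDATE());

    IF @@ROWCOUNT < @BatchSize
        BREAK;  -- nothing (or not enough) left to delete
END;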

Expansion of the lock scope:

Expansion of the lock scope illustrated (the entire blue frame is locked instead of just the correlated objects)

On the other hand, expanding the lock scope involves locking a "larger" object to force multiple sessions to wait for each other instead of interlocking. This can be achieved by using coarser locking hints, such as locking a table instead of a row, or a whole index instead of a page.

This approach reduces the likelihood of deadlocks by minimizing the number of concurrent transactions that can access the same resource at the same time, making your workload more serialized rather than concurrent. The risk here is worse performance due to increased locking (but not deadlocking). In other words, more sessions wait for each other to perform work. So, while it takes them longer, at least they're more likely to complete their work successfully without entirely failing due to a deadlock.
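A minimal sketch of the coarser-grained approach, using a table-level hint on a hypothetical table (TABLOCKX takes an exclusive table lock for the duration of the transaction, so competing sessions queue up behind each other instead of interleaving and deadlocking):

-- Sketch: take an exclusive table-level lock up front.
-- dbo.AccountBalances is a hypothetical table.
BEGIN TRANSACTION;

    UPDATE dbo.AccountBalances WITH (TABLOCKX)
    SET Balance = Balance - 100
    WHERE AccountID = 42;

    -- ... additional statements touching the same table ...

COMMIT TRANSACTION;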

When to expand and when to reduce?

The choice of lock scope depends on the specific situation and the nature of the application. Reducing the lock scope is generally preferable when concurrency is a high priority, while expanding the lock scope is preferable when data consistency is a high priority. However, expanding the lock scope can also lead to contention and reduced concurrency (and thus slower performance), so either change should be made mindfully.

In addition to changing the lock scope, you could also change the duration of the lock: either hold the lock for a shorter duration (by optimizing the queries, using shorter transactions, or splitting your logic into smaller, more manageable units of work), or extend the duration of the lock (by pre-locking critical resources, or by expanding the scope of your transaction), for example by using logical locks in the form of application locks as a way to implement "mutex locking" and protect "critical sections".
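Application locks (sp_getapplock / sp_releaseapplock) can serve as such a logical "mutex". The resource name below is arbitrary and the block is only a sketch:

-- Sketch: protect a "critical section" with an application lock.
-- The resource name 'MyCriticalSection' is arbitrary.
BEGIN TRANSACTION;

DECLARE @LockResult int;

EXEC @LockResult = sp_getapplock
        @Resource    = N'MyCriticalSection',
        @LockMode    = N'Exclusive',
        @LockOwner   = N'Transaction',
        @LockTimeout = 10000;   -- wait up to 10 seconds

IF @LockResult >= 0
BEGIN
    -- ... do the work that must not run concurrently ...
    COMMIT TRANSACTION;   -- the app lock is released together with the transaction
END
ELSE
BEGIN
    ROLLBACK TRANSACTION; -- could not acquire the lock in time
END;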

4. Snapshot Transaction Isolation Levels

If the deadlocks involve DML operations (UPDATE/INSERT/DELETE) on one side and read operations (SELECT) on the other, then it's possible to utilize the "Snapshot" isolation levels to completely get rid of this type of deadlock.

There are two types:

  • SNAPSHOT ISOLATION - After enabling the feature at the database level, this also requires explicitly changing the transaction isolation level to "Snapshot". It will affect the session until changed.

  • READ COMMITTED SNAPSHOT ISOLATION - After enabling the feature at the database level, this will automatically affect all transactions running in the "read committed" isolation level, which is the default in SQL Server.
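Enabling each of them looks roughly like this (replace MyDatabase with your own database name):

-- SNAPSHOT ISOLATION: enable at the database level, then opt in per session.
ALTER DATABASE MyDatabase SET ALLOW_SNAPSHOT_ISOLATION ON;

SET TRANSACTION ISOLATION LEVEL SNAPSHOT;  -- run in the session that should use it

-- READ COMMITTED SNAPSHOT ISOLATION: enable at the database level; all
-- "read committed" transactions are then affected automatically.
-- (Switching this on may require exclusive access to the database,
--  or the WITH ROLLBACK IMMEDIATE option.)
ALTER DATABASE MyDatabase SET READ_COMMITTED_SNAPSHOT ON;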

What these isolation levels do is create a "row version" stored in TempDB for each record that's currently being affected by a DML command. Then, any subsequent SELECT queries will access the row's "snapshot" saved in this "TempDB Version Store" instead of accessing the affected row that was not committed yet. In this case, the SELECT statement is not blocked at all.

Snapshot isolation level illustrated

💡 IMPORTANT NOTE: This will NOT reduce locking between concurrent DML commands. DML commands will still be blocked by other DML commands affecting the same data.

⚠ WARNING: Depending on the business requirements, using snapshot isolation levels may harm data consistency, since the "snapshot" rows retrieved from the TempDB Version Store are "outdated" compared to the data currently being updated (but not yet committed). If your SELECT queries require the most up-to-date data to be retrieved, even at the expense of being blocked by uncommitted DML transactions, then using snapshot isolation levels is not a good idea.

5. Other Transaction Isolation Levels

Similar to the concepts of reducing or expanding the scope of the lock, you could also consider using either a more restrictive or a less restrictive transaction isolation level.

Read Committed is the default transaction isolation level in SQL Server. But you could also use other isolation levels to play around with the scope or level of the locks, such as:

Serializable guarantees consistency by assuming that two transactions might try to update the same data and uses locks to ensure that they do not, but at the cost of reduced concurrency. It basically means "pre-locking" data that will potentially be accessed by a transaction.

Repeatable Read places shared locks on all data that is read by each statement in a transaction, and all the locks are held until the transaction completes. As a result, other transactions are unable to modify the data that has been read by the current transaction.
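Both are set per session. A minimal sketch (the table name is hypothetical):

-- Sketch: run a transaction under a more restrictive isolation level.
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;  -- or REPEATABLE READ

BEGIN TRANSACTION;

    SELECT SUM(Total)
    FROM dbo.Orders          -- hypothetical table
    WHERE Status = N'Pending';

    -- ... the rows (and, under SERIALIZABLE, the range) read above
    --     remain protected until the commit ...

COMMIT TRANSACTION;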

Conclusion

Other strategies for preventing deadlocks include normalization of the database schema and the use of appropriate indexing strategies. It is important to regularly monitor the database for deadlocks and optimize the system accordingly to prevent future ones.

SQL Server deadlocks can significantly impact your database's performance and reliability and, ultimately, your business operations. Implementing the measures we discussed can help mitigate the likelihood of deadlocks and keep your SQL Server instance running optimally.
