One of our customers has a complex T-SQL process that involves several long stored procedures and runs for hours. The process fails sometimes, and the customer doesn't know why it fails or where in the code it happens.
We decided to create an event session (Extended Events) to capture errors with all the information we need, so that we can then easily query and analyze the collected data.
This is actually a common solution for many use cases. It simply collects data about SQL errors in your SQL Server instance. One benefit is that you can easily customize it to filter specific errors (e.g. only high-severity errors). Another benefit is that you get the information in a nice tabular format, and you can then continue to manipulate it and format the way you want it.
Here is the script that creates the event session and the query to extract the collected data: https://github.com/MadeiraData/MadeiraToolbox/blob/master/Monitoring%20Scripts/Track%20SQL%20Errors.sql.
A few comments:
I use the "error_reported" event that triggers on any SQL error that is raised in the instance.
I filter on the severity to capture only real errors with severity above 10.
Notice that event timestamps are always in UTC, so I use a combination of SWITCHOFFSET and DATENAME to return the timestamp in the current time zone.
I use the "tsql_frame" action to extract the SQL handle, as well as the statement start & end offset and the line number. I then use sys.dm_exec_sql_text to retrieve the module name and the statement text. If the SQL handle doesn't exist anymore (the plan is no longer in cache), then all of this information is gone. This is why it's important to query the data soon after the event occurred or persist the results in a table to query it later.
Go ahead and try it. Create the event session. Then generate some errors. Then query the data to see how it looks like. Have fun!
Comments