A lot has been said and written about table variables vs. temporary tables in SQL Server. But I still meet developers and DBAs who are not familiar with the subject and write inefficient code as a result. I have a one-hour session called "The Complete Guide to Temporary Tables and Table Variables", and I figured it's time to write the blog post version of it. But since it's going to be a very long blog post, I decided to break it down into several parts.
So here is part 1. We begin with a...
Basic Comparison
Storage: There is a common myth that table variables are stored only in memory, but this is not true. They are stored in tempdb in the same way as temporary tables, with the same allocation and deallocation mechanisms. No difference.
Scope: Table variables are deallocated as soon as the batch is completed. If you try to declare a table variable and then use it in a separate batch, you will receive an error that it doesn't exist. Temporary tables, on the other hand, continue to exist as long as the session is still open. They are deallocated automatically when the session is closed or manually using the DROP TABLE statement. You can also create a global temporary table (##), which is accessible to all sessions, but this is kind of an edge case, so I'll stick to local temporary tables in this blog post.
Constraints: A temporary table has the same semantics as a regular table, which means you can use the same DDL statements to create, alter and drop temporary tables. It also means that you can add constraints to a temporary table in the same way that you would for e regular table. On the other hand, a table variable is just like any other type of local variable, which means you can only use the DECLARE statement to declare it once, and that's it. But you can still add constraints to the table, such as unique and check constraints, as part of the initial DECLARE statement.
Indexes: Everything I wrote above about constraints is the same for indexes. But here there is a tricky part. Usually, it makes more sense to create the temporary table first, then load it with data, and then create the necessary indexes on it. This will speed up the load phase, and it will also result in brand new indexes without any fragmentation. You can do this with temporary tables, of course. But with table variables, you must create the indexes as part of the DECLARE statement, which is before you load the data. This will result in a slower loading time as well as fragmented indexes, in most cases.
Statistics: Now, this is the biggest difference between the two, and I'll cover it in-depth in the following section when we deal with performance. It's quite simple: temporary tables have statistics, just like regular tables. Table variables don't have statistics at all. The cardinality estimator doesn't have a clue about the contents of table variables. In most cases (more details later), it doesn't even know the total number of rows in the table, and it will estimate a single row regardless of the actual number of rows. This can (and often does) lead to the wrong execution plan and to poor performance.
Recompilations: Statistics are really important for query performance, but they don't come for free. Whenever a statistics object gets updated, all the relevant statements that reference the relevant column(s) and might be affected must be recompiled to generate a new plan based on the new statistics. This means that for temporary tables, since they have statistics, they also have statistics updates and recompilations, and sometimes they can be quite impactful (more details later). But for table variables, there are no statistics updates and no recompilations (at least no recompilations that are caused by statistics updates).
Transactional Consistency: If you make changes to a temporary table within a transaction, and then the transaction is rolled back, then the changes you made to the temporary table are also rolled back. This is not the case with table variables. When a transaction is rolled back, the table variable doesn't change. It is not transactionally consistent.
Transaction Log Footprint: This is the Implication of the previous section. For a temporary table to be transactionally consistent, it has to be fully logged in tempdb to support the rollback operation. On the other hand, table variables are minimally logged, which results in fewer write operations against the tempdb transaction log file and better performance as a result.
Table-Valued Parameters: It can be very useful to pass a whole table as a parameter to a stored procedure or a function. You need to create a table type first, and then you can declare parameters with that type. You can then pass tables as parameters, but only with table variables. You can't create a temporary table from a table type nor you can pass it as a parameter. I will discuss this in more detail later in the blog post.
Parallelism: A plan operator that modifies data in a temporary table, such as an insert operator, can be serial or parallel, and the optimizer is free to choose the best approach. With a table variable, such operators can only run in serial mode. This is a limitation of SQL Server. If you try to insert a lot of data into a table variable, then it might take a lot of time, because only a single thread can insert the data.
Now that we understand all the differences, the big question is, of course:
When to Use What?
As long as we can get the same functionality, the main consideration is performance. Regarding functionality, the only differences are scope, transactional consistency, and table-valued parameters.
If your design requires that a table is accessible from a different scope (e.g. from another stored procedure), then you must use a temporary table. But I don't recommend such a design unless you have a good reason. Because it creates a dependency between different scopes, and this usually leads to higher maintenance costs. Instead, you should strive to design each batch as an independent black box with a well-known input and output. This is where table-valued parameters can become handy. I will discuss table-valued parameters later in this blog post.
As for transactional consistency, in most cases, you want your tables to be consistent, and this includes temporary tables. This means that you should use temporary tables rather than table variables. I've seen developers use table variables to store contextual information within a TRY block, and then use this information inside the CATCH block, even after a rollback. This is something you can't do with a temporary table. If this is your use case, then use a table variable. But in most cases, the 6 error functions accessible within the CATCH block will give you all the information you need.
The interesting part is performance, and this is what part 2 of this blog series is all about...
Comments