Which technique do you prefer when you create a stored procedure and need to work with temporary data? We know there is no definitive answer to this, but there are workloads where a Table Variable is always preferable to a Temporary Table.
Workload
The stored procedure of a software manufacturer is executed 24/7 by 12 threads. The customer was assured that the routine can perform at least 250,000 transactions/min.
Note
I am describing a very specific workload. My conclusion is not intended to express general validity. Rather, it shows that both ways of storing temporary data have their advantages and disadvantages!
Problem
All attempts to keep this promise with the customer’s data failed. My customer’s DBA was able to prove that the stored procedure is responsible for the high consumption of resources (CPU / storage).
"It's your hardware. You need faster CPUs, disks, and more memory. Other customers don't have this problem!" Come on, who hasn't heard these lines? Clench your fist and stay polite – the vendor will refuse to work with you if you don't follow his advice, right?
Analysis
The stored procedure uses a Temporary Table to store data that is used later in the procedure. The following – deliberately simple – stored procedure serves as an example.
CREATE OR ALTER PROCEDURE dbo.proc_recompile
AS
BEGIN
    SET NOCOUNT ON;

    CREATE TABLE #x (c_custkey BIGINT NOT NULL);
    ALTER TABLE #x ADD PRIMARY KEY CLUSTERED (c_custkey);

    INSERT INTO #x (c_custkey)
    SELECT TOP (1) c_custkey FROM dbo.customers;

    /* Do the rest of the stuff */
    SELECT * FROM #x
    WHERE c_custkey <= 0;
END
GO
A trace with Perfmon/Windows Admin Center on an isolated system clearly showed that the procedure creates a new temporary object each time it is executed. We ran 12 threads with 100,000 executions each.
Temp Tables Creation Rate
Using Windows Admin Center, we were able to visualize the problem. First, we noticed a high rate of new temporary objects being created when the stored procedure is executed. The high creation rate suggested that "Temporary Object Caching" was not working. To reduce the impact on TempDB structures, SQL Server can cache temporary objects for reuse. Since SQL Server 2005, instead of dropping a temporary object, SQL Server retains the system metadata and truncates the table data. Truncation reduces the storage requirement to a single (empty) data page and the allocation information to a single IAM page. However, DDL commands issued AFTER the creation of the temporary object prevent SQL Server from caching it. The ALTER TABLE command changes the metadata, so the temporary object is not cacheable!
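If you cannot run Perfmon or Windows Admin Center, the same counters can be read directly from SQL Server via the sys.dm_os_performance_counters DMV. The query below is a small sketch; note that the values of per-second counters are cumulative since instance start, so you need to sample twice and compute the delta yourself.

```sql
-- Sketch: read the relevant counters directly from SQL Server.
-- "Temp Tables Creation Rate" lives in the General Statistics object,
-- "SQL Re-Compilations/sec" in the SQL Statistics object.
SELECT  object_name,
        counter_name,
        cntr_value
FROM    sys.dm_os_performance_counters
WHERE   counter_name IN (N'Temp Tables Creation Rate',
                         N'SQL Re-Compilations/sec');
```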
SQL Re-Compilations/sec
Changing the metadata of the temporary object forces a RECOMPILE. The high number of SQL recompilations is a direct consequence of the non-cacheable temporary object. The "SQL Re-Compilations/sec" counter measures the number of times per second SQL Server recompiles a query or stored procedure. Recompilations can occur for various reasons, such as:
- Schema changes: Modifications to the database schema, like adding or dropping columns, indexes, or constraints.
- Statistics updates: Changes in table statistics that affect query execution plans.
- SET option changes: Changes to session-level SET options (e.g., ANSI_NULLS, QUOTED_IDENTIFIER).
- DDL operations: Data Definition Language (DDL) statements executed within a query or stored procedure.
- Data modifications: Significant changes in the data volume of tables referenced by the query.
A high value for this counter indicates frequent recompilations, which can lead to increased CPU usage and potential performance degradation. Ideally, the number of recompilations should be relatively low compared to the total number of batch requests.
In this scenario, recompilation is performed three times when the stored procedure is executed:
- DDL Command: ALTER TABLE
- INSERT data into the temporary object
- STATS UPDATE when the predicate accesses the key attribute
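To see exactly which of these three recompiles fires – and why – an Extended Events session on the sql_statement_recompile event is a useful companion to the Perfmon counters, because the event carries the recompile cause. The following is a minimal sketch; the session name xe_recompiles and the database name filter are assumptions for illustration.

```sql
-- Sketch: capture statement recompiles together with their cause.
-- Session name and database filter are placeholders - adjust for your system.
CREATE EVENT SESSION xe_recompiles ON SERVER
ADD EVENT sqlserver.sql_statement_recompile
(
    ACTION (sqlserver.sql_text)
    WHERE sqlserver.database_name = N'demo'   -- assumption: your database name
)
ADD TARGET package0.ring_buffer;
GO

ALTER EVENT SESSION xe_recompiles ON SERVER STATE = START;
```

After running the workload, the ring buffer target shows one event per recompile, including the cause (e.g., "Schema changed" for the ALTER TABLE, "Statistics changed" for the stats update).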
Solution – Phase 1
Adding the primary key after the table has been created means a subsequent change to the metadata. Since Microsoft SQL Server 2014, it has been possible to define indexes "inline" when the Temporary Table / Table Variable is created. This way there is only ONE DDL statement, and the temporary table object is cacheable.
CREATE OR ALTER PROCEDURE dbo.proc_recompile
AS
BEGIN
    SET NOCOUNT ON;

    CREATE TABLE #x (c_custkey BIGINT NOT NULL PRIMARY KEY CLUSTERED);

    INSERT INTO #x (c_custkey)
    SELECT TOP (1) c_custkey FROM dbo.customers;

    SELECT * FROM #x
    WHERE c_custkey <= 0;
END
GO
This minimal change saved one recompile, which has a massive impact on the overall behavior of the system. Since the temporary table is now cacheable, the object is not recreated with each execution of the stored procedure; instead, the cached metadata in TempDB is reused.
Because the metadata write operations in TempDB are no longer repeated on every execution, the number of batch requests/sec almost doubled.
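To verify that the plan is actually being reused rather than recompiled, the procedure-level execution statistics can be checked from the plan cache. The query below is a sketch against the example procedure; a steadily growing execution_count with an unchanged cached_time indicates plan reuse.

```sql
-- Sketch: confirm plan reuse for the example procedure.
-- A rising execution_count with a stable cached_time means the plan is reused.
SELECT  p.name,
        ps.execution_count,
        ps.cached_time,
        ps.total_worker_time
FROM    sys.dm_exec_procedure_stats AS ps
JOIN    sys.procedures AS p
        ON p.object_id = ps.object_id
WHERE   ps.database_id = DB_ID()
        AND p.name = N'proc_recompile';
```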
Solution – Phase 2
- The empty – cached – temporary table receives new data pages. This results in a RECOMPILE
- The SELECT causes another RECOMPILE because the statistics are automatically updated
Table Variables don’t have statistics!
A key difference between Temporary Tables and Table Variables is the use of statistics: Temporary Tables have statistics, Table Variables do not. The idea was therefore to replace the temporary table with a table variable. Because Table Variables have no statistics, no automatic statistics updates can be triggered – and the statistics-related recompile disappears.
CREATE OR ALTER PROCEDURE dbo.proc_recompile
AS
BEGIN
    SET NOCOUNT ON;

    DECLARE @T TABLE (c_custkey BIGINT NOT NULL PRIMARY KEY CLUSTERED);

    INSERT INTO @T (c_custkey)
    SELECT TOP (1) c_custkey FROM dbo.customers;

    SELECT * FROM @T
    WHERE c_custkey <= 0;
END
GO
| | Original | Phase 1 | Phase 2 |
| --- | --- | --- | --- |
| Total time | 21:38.3871 | 07:37.8509 | 03:41.3636 |
| avg batch requests/sec | 924 | 2,625 | 5,429 |
| avg iteration | 0.0125 sec | 0.0040 sec | 0.0016 sec |
| avg CPU | 0.0153 sec | 0.0024 sec | 0.0007 sec |
Are Table Variables the better solution?
It depends. Table Variables have some disadvantages that should be taken into account – there are countless articles about them on the Internet. I would only use Table Variables in very specific scenarios:
- small number of rows (<= 100) in temporary objects
- highly transactional systems
Thank you for reading!