Sparse Columns

Madeira Team
Feb 25, 2012
3 min read

One of the basic rules when creating or altering a table is to configure and adjust the right columns properties such as data types and nullable values. It is done because we assume what kind of data those columns will contain. For example, we will not define a person’s age column with “int” data type because it will just waste unnecessary storage space. Even if we will configure the optimal data types for each column we will still face columns that holds a lot of nullable or zeroes values. Those will use storage just as if they were containing with data. SQL Server 2008 offers a new column property which is called “sparse columns”. The goal of sparse columns is to minimize storage ignoring empty columns cells. Obviously, nothing comes free. Every sparse column data will increase each column’s data storage use by four bytes. Therefore if you define a column with the data type “int” that normally uses four bytes it will now need eight bytes.

Best practice recommendations:

The main idea of this post is to use “sparse columns” only when the nullable percentage of data is significantly high (Microsoft numbers are between 20-40 percent).

Secondly, use “sparse columns” when dealing with big storage data types because then the addition of the four bytes will be less significant comparing small data types such as “bit” or “tinyint”.

Third, when we want to create a filtered index on a column that holds nullable values, the best practice is to create the column with sparse. That way the SQL server knows to ignore the nullable values and save precious storage.

Pros

Reading and writing on “sparse columns” is no different than regular columns.
We can define an additional xml “column set” that will point to the “sparse columns” and gather their data (see an example at the end of the post).

Cons

“Sparse columns” is a property of the storage layer and not apart of the table properties layer so when we perform a “select into” command the sparse property will not be copied to the new table.
There are several data types that do not support “sparse column” such as: Text, Timestamp, Geometry, image, ntext, user-defined data types.
“Sparse columns” cannot be configured on “file stream” columns.
“Sparse columns” cannot be configured on a computed columns.
“Sparse columns” cannot have default values.
“Sparse columns” cannot be configured on clustered index columns.

Examples

In order to present and examine the use of “sparse columns” I created two identical tables. One with “sparse columns” and the other without “sparse columns”.

Then I inserted into the two tables 50,000 rows with nullable values.

Then, I called the ” sp_spaceused ” procedure to see the differences between the results. It is easy to see that the table with the “sparse columns” used much less storage space.

Transact-SQL

CREATE TABLE UnSparsed_Table( ID INT IDENTITY(1,1), ColX INT, ColY VARCHAR(100), ColZ DateTime) GO CREATE TABLE Sparsed_Table(ID INT IDENTITY(1,1), ColX INT SPARSE, Coly VARCHAR(100) SPARSE, ColZ DateTime SPARSE) GO DECLARE @iterator INT = 0 WHILE @iterator < 50000 BEGIN INSERT INTO UnSparsed_Table VALUES (NULL,NULL, NULL) INSERT INTO Sparsed_Table VALUES (NULL, NULL, NULL) SET @iterator+=1 END GO sp_spaceused 'UnSparsed_Table' GO sp_spaceused� 'Sparsed_Table' GO DROP TABLE UnSparsed_Table GO DROP TABLE Sparsed_Table GO

CREATE TABLE

UnSparsed_Table(

ID INT IDENTITY(1,1),

ColX INT,

ColY VARCHAR(100),

ColZ DateTime)

CREATE TABLE

Sparsed_Table(ID INT IDENTITY(1,1),

ColX INT SPARSE,

Coly VARCHAR(100) SPARSE,

ColZ DateTime SPARSE)

DECLARE

@iterator INT = 0

WHILE @iterator < 50000

BEGIN

INSERT INTO

UnSparsed_Table VALUES (NULL,NULL, NULL)

INSERT INTO

Sparsed_Table VALUES (NULL, NULL, NULL)

SET

@iterator+=1

END

sp_spaceused

'UnSparsed_Table'

sp_spaceused�

'Sparsed_Table'

DROP TABLE

UnSparsed_Table

DROP

TABLE Sparsed_Table

In order to present and examine the use of “column set” I created a table with two “sparse columns” and another XML column. Then I inserted one row into the table and selected it to see the results. As you can see the XML column holds the “sparse columns” data.

Transact-SQL

CREATE TABLE Example2( ColI int SPARSE, ColJ int SPARSE, XMLC xml column_set FOR ALL_SPARSE_COLUMNS) GO INSERT Example2(ColI, ColJ) VALUES (1,2) GO SELECT XMLC FROM Example2 GO