Data compression in SQL Server can go a long way toward saving storage space as well as improving performance, and it can be especially significant in very large databases. But in certain scenarios it can actually cause more harm than good, so it's important to identify those scenarios and distinguish the efficient from the harmful.
Data compression has been available in SQL Server since version 2008. If you have SQL Server 2016 SP1 or newer, it's available even in Standard Edition installations. In earlier versions, it's only available in Enterprise Edition (one more reason to upgrade if you're not on Enterprise Edition!).
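If you're not sure what you're running, a quick check with the built-in SERVERPROPERTY function (just a convenience sketch, nothing compression-specific) tells you the version, service pack level, and edition:

```sql
-- Check version and edition to see whether data compression is available.
-- Enterprise Edition, or any edition from SQL Server 2016 SP1 onwards, supports it.
SELECT
    SERVERPROPERTY('ProductVersion') AS ProductVersion,
    SERVERPROPERTY('ProductLevel')   AS ProductLevel,   -- e.g. RTM, SP1
    SERVERPROPERTY('Edition')        AS Edition;
```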
Improved Storage Disk Utilization
Successful data compression lets you fit more data rows into each data page. More rows per page mean fewer I/O operations needed to read or write the data on your storage disk. So not only does your disk space utilization improve, but so does your storage performance.
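For reference, enabling compression is a simple rebuild operation. The sketch below uses a hypothetical dbo.SalesOrders table and IX_SalesOrders_CustomerID index; PAGE and ROW are the two compression levels you can choose from:

```sql
-- Rebuild a table (its heap or clustered index) with PAGE compression.
-- dbo.SalesOrders and IX_SalesOrders_CustomerID are hypothetical names.
ALTER TABLE dbo.SalesOrders
    REBUILD WITH (DATA_COMPRESSION = PAGE);

-- Nonclustered indexes are compressed separately:
ALTER INDEX IX_SalesOrders_CustomerID ON dbo.SalesOrders
    REBUILD WITH (DATA_COMPRESSION = ROW);
```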
Improved Buffer Cache Utilization
It doesn't end there, though: SQL Server keeps the data compressed in RAM as well. This means SQL Server can hold more data in the buffer cache, potentially saving you money on RAM upgrades and improving performance even further, because when the data is already in the buffer cache, SQL Server has to access the storage disk even less. And the storage disk, as we all know, is the slowest piece of hardware in your server.
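If you're curious how much of the buffer cache each object is actually occupying (before and after compression), a query along these lines can show you. It's a rough sketch against the sys.dm_os_buffer_descriptors DMV, run in the database you care about, and it requires VIEW SERVER STATE permission:

```sql
-- Approximate buffer cache usage per object in the current database (8 KB per page).
SELECT
    OBJECT_NAME(p.object_id)  AS ObjectName,
    COUNT(*) * 8 / 1024       AS BufferCacheMB
FROM sys.dm_os_buffer_descriptors AS bd
JOIN sys.allocation_units AS au ON au.allocation_unit_id = bd.allocation_unit_id
JOIN sys.partitions AS p ON p.hobt_id = au.container_id
WHERE bd.database_id = DB_ID()
  AND au.type IN (1, 3)       -- IN_ROW_DATA and ROW_OVERFLOW_DATA map container_id to hobt_id
GROUP BY p.object_id
ORDER BY BufferCacheMB DESC;
```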
The Price: Increased CPU Usage
The price to pay, though, comes in the form of increased CPU utilization, because SQL Server has to compress and decompress every data page it writes or reads. On average, you should expect a 5% to 15% increase in CPU utilization.
The Numbers Game
On average, data compression in SQL Server reduces data size by about 40%. In some cases it can be so effective that you achieve as much as a 90% reduction in size! But if you compress the wrong table or index, you could save absolutely nothing and still pay the price in the form of higher CPU utilization.
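Before compressing anything, you can ask SQL Server for an estimate on a per-object basis. For example, for a single (hypothetical) dbo.SalesOrders table, the built-in sp_estimate_data_compression_savings procedure compares the current size with the estimated compressed size:

```sql
-- Estimate PAGE compression savings for one table (all its indexes and partitions).
-- dbo.SalesOrders is a hypothetical table name.
EXEC sys.sp_estimate_data_compression_savings
    @schema_name      = N'dbo',
    @object_name      = N'SalesOrders',
    @index_id         = NULL,   -- NULL = all indexes
    @partition_number = NULL,   -- NULL = all partitions
    @data_compression = N'PAGE';
```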
Here's a Script
This is why I created the T-SQL script linked below. It performs compression savings estimation checks for an ENTIRE database, taking into account everything that matters: table sizes, column data types, the ratio of updates vs. inserts vs. selects, and, best of all, it even generates the relevant remediation scripts for you!
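To give a rough idea of what running the estimation for an entire database involves, here is a bare-bones sketch (not the actual script, which also weighs table sizes, data types, and workload patterns) that simply loops over every user table and calls the built-in estimation procedure:

```sql
-- Minimal sketch: run the estimation procedure for every user table in the database.
DECLARE @schema SYSNAME, @table SYSNAME;

DECLARE tables_cur CURSOR LOCAL FAST_FORWARD FOR
    SELECT s.name, t.name
    FROM sys.tables AS t
    JOIN sys.schemas AS s ON s.schema_id = t.schema_id;

OPEN tables_cur;
FETCH NEXT FROM tables_cur INTO @schema, @table;

WHILE @@FETCH_STATUS = 0
BEGIN
    EXEC sys.sp_estimate_data_compression_savings
        @schema_name      = @schema,
        @object_name      = @table,
        @index_id         = NULL,
        @partition_number = NULL,
        @data_compression = N'PAGE';

    FETCH NEXT FROM tables_cur INTO @schema, @table;
END

CLOSE tables_cur;
DEALLOCATE tables_cur;
```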
Thanks to the proven effectiveness of this script, we now use it as part of our Data Architecture Review service.