One of the new features in SQL Server 2008 is the ability to compress the data in your database. I’ve recently been investigating this feature and here are some of my notes on the topic. There are 2 different types of compression – Row and Page compression. In very simplistic terms, Here is an explanation of how these 2 types of compression work.
1. Row Compression
Supposing that you have a CHAR(100) field, all values saved into this field will use up 100 characters. By using Row Compression, fixed length types will be saved using a variable length format. Saving "Hello" into the field above will therefore take up 5 characters, not 100. This method of compression can be applied to all fixed length data types including int, char and float.
2. Page Compression
Page compression minimises repetition and redundancy in the data. Chad Boyd gives an excellent explanation of this:
The example above is called ‘Prefix Compression’ because it is performed on the prefixing values of each column. Following Prefix Compression, ‘Dictionary Compression’ is applied. This replaces repeated values that occur anywhere on a page and unlike row compression, it is not restricted to a single column.
Applying Compression to a table
To enable compression for a table, right click the table in Management Studio and bring up the ‘Storage’ menu item. From here, you will find an option to ‘Manage Compression…’. Selecting this item will bring up the ‘Data Compression Wizard’.
The second page in the wizard enables you to choose between either Page or Row compression. A ‘Calculate’ button on the bottom right shows you the estimated space saving based upon the selected compression type.
If you want to compress an index, you can do so by right clicking the index and starting the Data Compression Wizard from there.
The TSQL DDL for creating and altering tables is as follows:
CREATE TABLE <table_name> (<column_names>)
WITH (DATA_COMPRESSION = ROW | PAGE)
ALTER TABLE <table_name>
REBUILD PARTITION = 1 WITH (DATA_COMPRESSION = ROW | PAGE)
The TSQL for calculating the potential space savings for Row and Page compression is as follows:
EXEC sp_estimate_data_compression_savings 'dbo', 'tblTimsTestTable', NULL, NULL, 'ROW'; GO EXEC sp_estimate_data_compression_savings 'dbo', 'tblTimsTestTable', NULL, NULL, 'PAGE'; GO
Here’s an excellent article on Compression Strategies by Sunil Agarwal (http://blogs.msdn.com/sqlserverstorageengine/archive/2008/01/27/compression-strategies.aspx)
To briefly summarise his points:
- Don’t use compression if the relative space saving is small compared to the overall size of the database
- In a typical OLTP system with lots of DML activity, Row compression is more effective
- On a table with multiple indexes, compress those that are used least frequently.
SQL Server compression is a large topic and if you wish to learn more, here are a couple of links from MSDN that are very useful.