How much will a table compress? That depends on both the type of data and the type of compression. If your data is something like SAP data, which has a lot of trailing spaces, then row-level compression will serve you well. If you have a lot of repeating data, then page-level compression will be your best friend.
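To see why padded data responds so well to row-level compression, here is a minimal sketch in Python. It is a toy model, not SQL Server's actual row format: the 40-character column width, the one-byte length prefix, and the sample names are all assumptions chosen for illustration.

```python
# Toy model only: compares fixed-width storage with storage that drops
# trailing spaces. This is NOT SQL Server's real on-disk row format.

WIDTH = 40  # hypothetical CHAR(40)-style column


def fixed_width_size(values, width):
    """Every value occupies the full column width, padding included."""
    return width * len(values)


def trimmed_size(values, length_bytes=1):
    """Trailing spaces are dropped; each value pays a small length prefix."""
    return sum(len(v.rstrip(" ")) + length_bytes for v in values)


# Hypothetical sample data, padded the way a fixed-width column would store it.
names = ["ACME", "Globex Corp", "Initech", ""]
padded = [n.ljust(WIDTH) for n in names]

fixed = fixed_width_size(padded, WIDTH)
trimmed = trimmed_size(padded)
print(f"fixed-width: {fixed} bytes, trimmed: {trimmed} bytes")
```

The shorter the real values are relative to the declared width, the bigger the gap, which is why padded data tends to be a good candidate.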
The order of your data also matters. Because SQL Server compresses at the page level, repeating values are de-duplicated only when they land on the same page, which is what happens when like values are clustered together. So it pays to be aware of both the nature and the order of your data.
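The effect of ordering can be illustrated with a rough Python sketch. It assumes a simplified per-page dictionary in which each page stores one copy of every distinct value plus a small per-row reference. The 100-rows-per-page figure, the two-byte reference, and the generated customer names are hypothetical, and the model ignores everything else SQL Server does on a real 8 KB page.

```python
import random

PAGE_ROWS = 100  # hypothetical rows per page, not a real 8 KB page layout


def paged_size(values, page_rows=PAGE_ROWS, ref_bytes=2):
    """Toy per-page dictionary: each page keeps one copy of every distinct
    value it holds, plus a small reference per row. A duplicate that sits
    on a different page saves nothing."""
    total = 0
    for start in range(0, len(values), page_rows):
        page = values[start:start + page_rows]
        total += sum(len(v) for v in set(page))  # one copy per distinct value
        total += ref_bytes * len(page)           # per-row reference
    return total


random.seed(42)
# 200 hypothetical customer names, 24 bytes each, repeated across 50,000 rows.
customers = ["customer-name-%010d" % i for i in range(200)]
rows = [random.choice(customers) for _ in range(50_000)]

raw = sum(len(v) for v in rows)
shuffled = paged_size(rows)           # duplicates scattered across pages
clustered = paged_size(sorted(rows))  # duplicates grouped together

print(f"raw:       {raw} bytes")
print(f"shuffled:  {shuffled} bytes ({1 - shuffled / raw:.0%} saved)")
print(f"clustered: {clustered} bytes ({1 - clustered / raw:.0%} saved)")
```

In this model the same rows shrink far more when the repeats sit next to each other, because each page then carries only a handful of distinct values.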
All that said, just as with Advanced Compression in Oracle Database 11g, my results with Data Compression in SQL Server 2008 were all over the map. I got as little as 17 percent compression on some data sets, and as much as 76 percent on another. The high of 76 percent was reached on denormalized order line data in a data warehouse.
When it comes to Data Compression, SQL Server has one advantage over Oracle: SQL Server will compress existing tables in a single statement. In Oracle Database 11g, you have to create a new table, insert the data, and then drop the old table. Not only does SQL Server spare you from creating an additional copy of the data, but you can still perform operations on the table during the compression operation, with no downtime required.
An excellent new feature called Sparse Columns allows you to store null values without taking up any physical space. But here Microsoft made a big mistake: Sparse Columns aren't compatible with Data Compression. If you define Sparse Columns on a table, you can't also apply Data Compression. As I mentioned in my preview, this one may be worthy of a Darwin Award. Sparse Columns and Data Compression would be a perfect match. Let's hope Microsoft fixes this soon.
Backup Compression too
SQL Server 2008 also introduces Backup Compression. In some preliminary tests, I found the compression ratios to be mostly on par with those of Quest LiteSpeed and other third-party backup tools I've used. Still, SQL Server's Backup Compression has a number of strikes against it. First, Backup Compression is available only in the enterprise edition. Even in shops that run enterprise edition, that's just going to make things harder to manage. Chances are the lion's share of SQL Server instances in any sizable organization aren't enterprise edition, and making the enterprise edition instances the exception in the backup plan just isn't acceptable.
Look at it like this: Database size isn't the reason you've chosen enterprise edition. Because most editions of SQL Server can handle unlimited data size, enterprise edition is generally chosen for the features. As a result, you're likely to have plenty of SQL Server boxes in your shop with very large data sets that aren't on enterprise edition, and they need compressed backups too, which means you probably already have a third-party backup solution in place. Switching backup routines for a small subset of your SQL Server boxes just doesn't make sense. You'll surely want to have the same solution for your entire environment.