Retrieving the changes from the CDC tables is very easy also. Microsoft has provided a set of functions that retrieve the data, so all you have to do is call them. As you would expect, both Type 1 and Type 2 data can be stored, so you can have the exact level of change data you want.
Nonlogged inserts allow SQL Server to do minimal logging for large inserts. By not logging every data row inserted, you can speed up your data loads by orders of magnitude in some cases. At the very least, you will see significant performance gains.
You could do nonlogged inserts in previous versions of SQL Server using what's known as a "select into" statement, but "select into" actually creates the table for you. This is great for new tables, but if you have existing tables with security and other attributes defined, you don't want to delete and re-create them. Another problem with "select into" is that you have no control where your new table goes; it always joins the default file group.
The new nonlogged insert allows you to insert the rows into an existing table, maintaining the control you have over your space, performance, and security, and still get the benefit of an ultra-fast data load. The general rule of thumb I like to follow for warehouses (and any database, come to think of it) is no moving schema. That means you don't want to delete and re-create permanent objects every day. It's error prone and introduces complexity into your system you don't need, not to mention your tables aren't available during the load because they've been deleted.
In my testing, non-logged inserts performed right on par with their "select into" cousins. As the DBA of a large warehouse that faces data loading problems every day, I love this feature. There's just one thing I would change about it: I wish there were an option I could tack onto my insert statement to throw it into non-logged mode.
SQL Server's first attempt at Data Compression actually looks pretty good. As I noted in my beta preview, SQL Server 2008 provides two types of compression: row and page. Row compression is true compression, in which unused spaces at the ends of columns are removed to save storage. Page compression, aka dictionary compression, normalizes the data on each page and keeps a lookup pointer. In SQL Server 2008, page compression includes row compression. If you have page compression turned on, you get row compression in the bargain.
Microsoft provides a handy compression calculation wizard that will give you a good estimate of the benefits you can expect. The wizard runs a test compression scenario against your data for each compression type (row and page) and tells you what the new size of the table should be. I tested the compression calculator against a number of data sets, and on average the calculation deviated from my final results by only 1 or 2 percent. That's pretty good, considering that the calculation is based on a relatively small amount of data.
Sign up for CIO Asia eNewsletters.