Skip to main content

Posts

Showing posts from July, 2010

More on high performance data loading

Today I stumbled across what is possibly the most comprehensive, useful page on the net about bulk loading data into SQL Server 2008 . While it is packed with useful information, one of the pearls is about trace flag 610. In a nutshell, trace flag 610 instructs the server to minimally log all operations that can be minimally logged. In practical terms, this means that MERGE and INSERT operations will minimally log new page allocations . Without trace flag 610, these operations are fully logged. The link (above) goes into great detail about the usage of trace flag 610, so I'll simply summarize as follows: DBCC TRACEON (610);               -- enables trace flag 610 MERGE INTO xxx USING yyy ON       -- do minimally logged upsert DBCC TRACEOFF(610);               -- restore normal mode. Note that setting trace flag 610 will affect all connections to the server until it is reset, so avoid using it on main OLTP servers. For a usable T-SQL MERGE example, see Sunil's blog posting .

SSIS - High Performance Data Warehouse Loading

In a previous post I mentioned how to achieve data transfers at near file-copy speeds using SSIS. In this follow-on, I’m going to show you how to put this information to practical use for quickly loading a data-warehouse. Generally speaking, the idea behind data warehousing is to reduce the load on transactional systems by providing a central repository of which has been normalized in a manner that is ideal for drawing reports and analytics. Strategies for loading data warehouses typically fall into one of two categories: 1.       Periodically resetting and repopulating the data warehouse during off-peak times, 2.       Performing an initial data load, followed by on-going incremental loads. Significant changes to any transactional systems usually necessitate changes to a data warehouse or, at least, changes to how the data warehouse is populated, which means that no matter which strategy you follow, you’ll benefit from being able to achieve fast full-reloads of the data

SSIS: Check List for Minimally Logged Inserts

Over the coming weeks I’ll be presenting a series of posts on advanced techniques for achieving high performance SSIS data loads. In this post, we’ll focus on a brief check-list for achieving high speed data inserts into a SQL Server target table. There are many disparate sources, but I’ve yet to find a single checklist of everything you need to consider for achieving minimally logged insert operations. The prerequisite for achieving data loads at speeds comparable to a file copy ultimately come down to two main things: 1. Insert operations need to be minimally logged. 2. Inserted data needs to be sorted according to the target table’s clustered index (primary key). When insert operations are minimally logged, it means that the transaction log is bypassed during the insert operation – i.e. only the MDF (data) file is written to, instead of the MDF and LDF file/s. When the inserted data is sorted according to the target table’s primary key, it means that SQL server does not need