Friday, June 6, 2008

SQL Server 2008 Integration Services: From D’oh to Wow

This was a class about the pros and con of using SSIS (SQL Server Integration Service)
  • Use SSIS to Load two flows in parallel instead of doing Union ALL
  • Have made improvements in SQL Server 2008’s buffering which make a significant improvement in the time to bring in large amounts of data. When caching data for the lookup component you cache entire table and that wastes memory. Fix: do not use Select *
  • In 2008, caches can also be persisted for even better performance and scale.
  • In 2005 use a View instead of a table so you pick only the columns you need to cache.
  • If querying two or more tables and you want to use the same cache, use the same column names in both queries even if all tables don’t have the same columns in it.
  • Doing to much work in SSIS often causes a serious performance hits. Joins and sort that make use of indexes in the database, are instead performed expensively in SSIS. Rows or columns which are not used may also be extracted. Push filtering, sorting and joining into the source system if possible – so long as source system resources are available.
  • Row sampling or row counts can be used as destinations. Can execute the package without the destination to determine performance bottlenecks. This is a method for testing performance without put the data on the destination.
  • Loading many flat files. Use the Multi-Source Flat-File Source Adapter. Loads similar files.
  • There are Data Mining techniques that can be used as the data is coming in.

http://bi-polar23.blogspot.com
Matthew Rocae BI/SSIS BLOG

Summary: SSIS is the upgraded version of DTS packages. Unless we want to take advance of the SSAS and Data Cubing or Data Mining I think we could continue to move our data using store procedures. I talked to the presenter of this class and he agreed.

No comments: