
SSIS Performance Comparison

While working with a client this past week, I ran into a scenario where I needed to compare two dates in an SSIS package. One date would be the same for all rows, and the other would be different for each row. There are two possible implementations: include the "same date" as a column in every row and do a column-to-column comparison, or put the "same date" into a variable and compare the variable to the column. My first instinct was that the column comparison would be faster, because the extra trip to the server to populate the variable would cost more than the extra column space. As I thought about it more, I realized that as the dataset grows, the time to handle the extra column might overtake the cost of the extra trip.
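To make that trade-off concrete, here is a minimal sketch of a linear cost model in Python. It assumes the variable approach pays one fixed cost (the extra query) and the column approach pays a small cost on every row. The function names and both timing constants are hypothetical, chosen only for illustration; they are not measurements from the packages described in this post.

```python
# Illustrative cost model. All numbers are made-up assumptions,
# not timings measured from the SSIS packages in this post.

TRIP_COST_US = 500_000   # hypothetical: one extra round trip, in microseconds
PER_ROW_COST_US = 1      # hypothetical: carrying one extra column value per row


def column_overhead(rows: int, per_row_us: int = PER_ROW_COST_US) -> int:
    """Extra time spent when the constant date travels as a column on every row."""
    return rows * per_row_us


def variable_overhead(trip_us: int = TRIP_COST_US) -> int:
    """Extra time spent on the single query that loads the variable."""
    return trip_us


def break_even_rows(trip_us: int, per_row_us: int) -> float:
    """Row count at which both approaches carry the same overhead."""
    return trip_us / per_row_us


def cheaper(rows: int, trip_us: int = TRIP_COST_US,
            per_row_us: int = PER_ROW_COST_US) -> str:
    """Name the approach with the smaller overhead for a given row count."""
    if column_overhead(rows, per_row_us) < variable_overhead(trip_us):
        return "column"
    return "variable"


if __name__ == "__main__":
    print("break-even rows:", break_even_rows(TRIP_COST_US, PER_ROW_COST_US))
    for rows in (100_000, 1_000_000):
        print(rows, "rows ->", cheaper(rows))
```

Under these assumed constants the column approach wins for small row counts and the variable approach wins past the break-even point. Where that point actually falls depends on the real costs, which is what a test has to measure.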

So I ran a test to find out!

Scenario:

I created a temp table in the AdventureWorksDW database called FactLottaInternetSales that contains multiple copies of the FactInternetSales table, but with every OrderDateKey set to 1.

I created two packages to do the comparison.
Package 1: Column.dtsx

1 Data Flow Task containing:

  • 1 OLE DB Source:
    SELECT TOP x *
    FROM [AdventureWorksDW].[dbo].[FactLottaInternetSales]
    WHERE OrderDateKey = 1
  • 1 Conditional Split: OrderDateKey <>
  • 1 Row Count (as a terminator)

Package 2: Variable.dtsx

1 Execute SQL Task, which assigns the result of this query to the variable DateKey:
SELECT MAX(OrderDateKey)
FROM [AdventureWorksDW].[dbo].[FactLottaInternetSales]
WHERE OrderDateKey = 1

1 Data Flow Task containing:

  • 1 OLE DB Source:
    SELECT TOP x [ProductKey], [OrderDateKey], [DueDateKey], [ShipDateKey], [CustomerKey], [PromotionKey], [CurrencyKey], [SalesTerritoryKey], [SalesOrderNumber], [SalesOrderLineNumber], [RevisionNumber], [OrderQuantity], [UnitPrice], [ExtendedAmount], [UnitPriceDiscountPct], [DiscountAmount], [ProductStandardCost], [TotalProductCost], [SalesAmount], [TaxAmt], [Freight], [CarrierTrackingNumber], [CustomerPONumber]
    FROM [AdventureWorksDW].[dbo].[FactLottaInternetSales]
    WHERE OrderDateKey = 1
  • 1 Conditional Split: @[User::DateKey] <>
  • 1 Row Count (as a terminator)

Results:




The variable package is slower up to about 2 million rows. Beyond that, the column package is slower! Happy performance tuning :)


Version: SQL Server 2005 SP2
