While working with a client this past week, I ran into a scenario where I needed to compare two dates in a SSIS package. One date would be the same for all rows, and one date would be different for each row. There are two possible implementations: include the "same date" in all rows and do a column comparison, or put the "same date" into a variable and do a comparison between the variable and column. My first instinct was that the column comparison would be faster because the extra trip to the server would be more expensive than extra column space. As I thought about it more, I realized that as the size of the dataset increases, the time to handle the extra column might overtake the hit from the extra trip. So I ran a test to find out! Scenario: I created a temp table in the AdventureWorksDW database called FactLottaInternetSales that contains multiple copies of the FactInternetSales database, but with all orderdatekeys set to 1. I created two packages to do the com...
Just a girl that loves data