
Pipeline Parallelization

One of the new features of SQL Server Integration Services 2008 is pipeline parallelization. The data flow engine can now create multiple execution threads within a single execution tree, and those threads can run on different processors, which can improve the performance of the package. In SSIS 2005, this separation of threads could be emulated by adding a Union All transformation, which creates a new buffer and therefore starts a new execution tree.

If you turn on logging in the data flow, you can see these separate execution trees in the PipelineExecutionPlan and PipelineExecutionTrees events. To test this, I created a simple package in which a Script Component generates a row set of integers and a Multicast sends it to both a Conditional Split (used as a terminator) and a Flat File Destination.

Here are the messages from those events:
User:PipelineExecutionTrees
Begin Path 0
output "Output 0" (32); component "Script Component" (29)
  input "Multicast Input 1" (53); component "Multicast" (52)
  Begin Subpath 0
    output "Multicast Output 1" (54); component "Multicast" (52)
    input "Flat File Destination Input" (64); component
      "Flat File Destination" (63)
  End Subpath 0
  Begin Subpath 1
    output "Multicast Output 2" (69); component "Multicast" (52)
    input "Conditional Split Input" (57); component
      "Conditional Split" (56)
  End Subpath 1
End Path 0


User:PipelineExecutionPlan
Begin output plan
  Begin transform plan
  End transform plan
  Begin source plan
    Call PrimeOutput on component "Script Component" (29)
      for output "Output 0" (32)
  End source plan
End output plan

Begin path plan
  Begin Path Plan 0
    Call ProcessInput on component "Multicast" (52)
      for input "Multicast Input 1" (53)
    Create new execution item for subpath 0
    Create new execution item for subpath 1
    Begin Subpath Plan 0
      Call ProcessInput on component "Flat File Destination" (63)
        for input "Flat File Destination Input" (64)
    End Subpath Plan 0
    Begin Subpath Plan 1
      Call ProcessInput on component "Conditional Split" (56)
        for input "Conditional Split Input" (57)
    End Subpath Plan 1
  End Path Plan 0
End path plan
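
If you want to check how many subpaths each path was split into without reading the message by eye, the Begin/End markers are regular enough to count with a short script. The following is a minimal sketch in Python, not part of SSIS: the function name `count_subpaths` and the `TREE_LOG` sample are my own, and it assumes the message text looks like the PipelineExecutionTrees output above (shortened here to the lines that matter). Nested subpaths, if any, are all counted under their enclosing path.

```python
import re

# Shortened copy of the PipelineExecutionTrees message shown above.
# Only the Begin/End marker lines are used by the parser.
TREE_LOG = """\
Begin Path 0
output "Output 0" (32); component "Script Component" (29)
  input "Multicast Input 1" (53); component "Multicast" (52)
  Begin Subpath 0
    output "Multicast Output 1" (54); component "Multicast" (52)
  End Subpath 0
  Begin Subpath 1
    output "Multicast Output 2" (69); component "Multicast" (52)
  End Subpath 1
End Path 0
"""


def count_subpaths(log_text):
    """Return {path_id: subpath_count} for a PipelineExecutionTrees message.

    Hypothetical helper: assumes the 'Begin Path N' / 'Begin Subpath N' /
    'End Path N' line format seen in the log above.
    """
    counts = {}
    current_path = None
    for raw_line in log_text.splitlines():
        line = raw_line.strip()
        path_match = re.fullmatch(r"Begin Path (\d+)", line)
        if path_match:
            current_path = int(path_match.group(1))
            counts[current_path] = 0
        elif re.fullmatch(r"Begin Subpath \d+", line) and current_path is not None:
            counts[current_path] += 1
        elif re.fullmatch(r"End Path \d+", line):
            current_path = None
    return counts


if __name__ == "__main__":
    print(count_subpaths(TREE_LOG))
```

For the package in this post, the script reports two subpaths under path 0, matching the two "Create new execution item" lines in the execution plan.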


You can see that the execution trees now contain a subpath for both the Flat File Destination path and the Conditional Split path, and that the execution plan creates a separate execution item for each one. The best part is that each subpath can then be executed on a different processor, which can increase the speed of the package!
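
To make the idea of "one execution item per subpath" more concrete, here is a rough analogy in Python. It is only a sketch of the pattern, not how the SSIS engine is implemented: the names `run_multicast` and `subpath_worker` are hypothetical, buffers are modeled as plain lists of integers, and each subpath simply collects the rows it receives. Note also that Python threads do not run CPU-bound work on several processors at once, so this shows the structure of the fan-out rather than the speedup.

```python
import queue
import threading

SENTINEL = None  # marks the end of the row stream for a subpath


def subpath_worker(name, inbox, results):
    """Drain one subpath's queue on its own thread (one 'execution item')."""
    rows = []
    while True:
        buffer = inbox.get()
        if buffer is SENTINEL:
            break
        rows.extend(buffer)
    results[name] = rows  # each worker writes only its own key


def run_multicast(buffers, subpath_names):
    """Send every buffer to every subpath; each subpath runs on its own thread."""
    inboxes = {name: queue.Queue() for name in subpath_names}
    results = {}
    threads = [
        threading.Thread(target=subpath_worker, args=(name, inboxes[name], results))
        for name in subpath_names
    ]
    for thread in threads:
        thread.start()
    # The "multicast" step: hand the same buffer to every subpath.
    for buffer in buffers:
        for inbox in inboxes.values():
            inbox.put(buffer)
    for inbox in inboxes.values():
        inbox.put(SENTINEL)
    for thread in threads:
        thread.join()
    return results


if __name__ == "__main__":
    source_buffers = [[1, 2, 3], [4, 5, 6]]
    out = run_multicast(source_buffers, ["flat_file_destination", "conditional_split"])
    print(out["flat_file_destination"])
    print(out["conditional_split"])
```

Both subpaths receive the same six rows in the same order, and neither has to wait for the other to finish a buffer before starting its own work, which is the behavior the subpath plans in the log describe.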

Version: SQL Server 2008 CTP6
