Skip to main content

Manipulating Excel Spreadsheets in SSIS

Tom, an attendee at last weekend’s SQLSaturday Olympia, asked me how to refresh a spreadsheet from within SQL Server Integration Services. My first thought was to turn on the connection’s “Refresh data when opening the file” option in the spreadsheet itself and avoid the situation entirely; however, this may not always be a viable solution. Here are the steps to perform the refresh from within an SSIS package.

First, ensure that Microsoft.Office.Interop.Excel is registered in the GAC. If not, install the 2007 Microsoft Office system Primary Interop Assemblies. This will need to be done on any machine where you plan on running this package.

Next, create a script task in your SSIS package that contains the following code (include your spreadsheet name):

Imports System
Imports System.Data
Imports System.Math
Imports Microsoft.SqlServer.Dts.Runtime
Imports Microsoft.Office.Interop.Excel

Public Class ScriptMain

Public Sub Main()
Dts.TaskResult = Dts.Results.Success

Dim excel As New Microsoft.Office.Interop.Excel.Application
Dim wb As Microsoft.Office.Interop.Excel.Workbook

wb = excel.Workbooks.Open("C:\\TestExcelSS.xlsx")
wb.RefreshAll()
wb.Save()
wb.Close()

excel.Quit()
Runtime.InteropServices.Marshal.ReleaseComObject(excel)

End Sub

End Class

You'll see error squiggles, but don't worry about them because they will disappear in just a minute. Save and close your package. In your Solution Explorer, right click on the package and select ‘View Code’.

In the resulting XML, change the Build Settings ReferencePath property to:
"C:\Windows\assembly\GAC_MSIL\Microsoft.SqlServer.ScriptTask\9.0.242.0__89845dcd8080cc91\;C:\Windows\assembly\GAC_MSIL\Microsoft.SqlServer.ManagedDTS\9.0.242.0__89845dcd8080cc91\;C:\Windows\assembly\GAC\Microsoft.Office.Interop.Excel\12.0.0.0__71e9bce111e9429c\"

Also change the Build References to include:
<Reference
Name = "Microsoft.Office.Interop.Excel"
AssemblyName = "Microsoft.Office.Interop.Excel"
/>

Save the XML, and reopen the package. Open the script task and select ‘Save’. This will compile the code, and now you can run your package.

When working with COM references, you can use the script task GUI to add the reference by adding the desired component to the .NET framework folder. I could not find Microsoft.Office.Interop.Excel.dll on my machine to move to the framework folder, which is why we added the reference through the XML.

As Douglas Laudenschlager notes, writing server-side code to access client-side Office is unsupported. Please take these possible problems under advisement and code as necessary. You have been warned. :)

Update (11/12/08): Added last two lines to code to stop Excel process.

Versions: Microsoft Office 2007, SQL Server 2005 SP2

Comments

Anonymous said…
Hi Jessica,

After I ran the code, in my Windows Task Manager Excel.exe is opened as many times as I run the code, how can I get this code to close Excel.exe.? I tried excel.quit() but it didn't help.

Regards,

Christian
Jessica M. Moss said…
Hi Christian,

If you add this snippet of code after the existing script, it removes the current process from the Task Manager:

excel.Quit()
Runtime.InteropServices.Marshal.ReleaseComObject(excel)

I'm not familiar with how the management of COM objects work, so I'm not sure if there are any aftereffects, but it appears to work.

Jessica
Unknown said…
Hi Jessica,

Do you know how can I use Excel functions inside of SSIS ?

My situation is like this:
I have to import an Excel spreadsheet into SSIS.
Then I have to find the 50th and 75th percentile of the values in a certain column.
Based on the 2 percentile values, I have to create a derived column.

This has to be be done inside a package in SSIS.

In MS Excel, I could use the Percentile function to get the percentile values. But how can I do the same thing inside SSIS ?


Thanks and Regards.
Jessica M. Moss said…
Hi Vivek,

To figure out a percentile within an SSIS package, you would want to either use VB.NET code in a script task or an Execute SQL Task after loading the data into a temporary table. Keep in mind that percentiles (http://en.wikipedia.org/wiki/Percentile) are not always calculated the same way, but here is an example that someone created for SQL: http://www.sqlteam.com/article/computing-percentiles-in-sql-server

Good luck!
Jessica
Anonymous said…
Hi Jessica,

Thanks for the help. The problem is that I need to hide two sheets after I refreshed my Workbook:
wsListOfItems = CType(wb.Sheets("ListOfItems"), Microsoft.Office.Interop.Excel.Worksheet)
wsAddress = CType(wb.Sheets("Address"), Microsoft.Office.Interop.Excel.Worksheet)
wsListOfItems.Visible = XlSheetVisibility.xlSheetHidden
wsAddress.Visible = XlSheetVisibility.xlSheetHidden

This lines of code keeps Excel.exe open in memory(Windows Task Manager).

Another question that I want to ask is, how come when I use wb = excel.Workbooks.Open("C:\\TestExcelSS.xls") it enables the Macros as well?

Regards,

Christian
Jessica M. Moss said…
Hi Christian,

We're starting to get into Excel programming questions, which is not my area of expertise. Here's a good source of information for Excel programming: http://msdn.microsoft.com/en-us/library/bb687921.aspx or you can try posting your question on the associated newsgroup: http://www.microsoft.com/communities/newsgroups/en-us/?dg=microsoft.public.excel.programming

Good luck!
Jessica
Anonymous said…
Followed the sample, but I get the error 'Microsoft.Office.Interop.Excel.Application' is not defined. The only difference is that my interop assembly is verion 11.0.0.0 (this shows on the GAC)

I can see "Microsoft.Office.Interop.Excel" under Reference in VS for Application, but when I run the task it gives me that error. Any idea?
Jessica M. Moss said…
Hi ColeR,

Because you're working with an earlier version of Excel, you'll need to change the reference. Replace C:\Windows\assembly\GAC\Microsoft.Office.Interop.Excel\12.0.0.0__71e9bce111e9429c\ with your version (I don't have that specific file to tell you what it should be). Make sure you modify the "71e9bce111e9429c" to match your file as well.

-Jessica
Anonymous said…
Hi Jessica,

This code does not work for us as it saves the file before the refresh is finished. When using excel 2003 we used the xlSheet.QueryTables(1).Refresh(False) method and it worked fine as long as your query was set up using excel 2003 or earlier version. By setting the false it stays on that line of code until the refresh is finished. Now if we set up a new query using Excel 2007 this line of code breaks.

Do you have any ideas?
Jessica M. Moss said…
Hello,

Here are a couple of things to try... The first is that you might want to look at setting PivotCache.BackgroundQuery = False on any pivot tables you have - those are set to run in the background, so may be where the issue lies. The other is to try inserting a Thread.Sleep() for a few milliseconds/seconds after you call the Refresh for debugging purposes.

HTH,
Jessica
Anonymous said…
Hi Jessica,

I just want to know. I'm using Excel2007 and I have an excel macro enable file, *.xlsm, which needs updating. Will this code refresh the data as well?

Regards

Christian
Jessica M. Moss said…
Hi Christian,

According to this article, http://msdn.microsoft.com/en-us/library/bb223574.aspx, pivot tables and external data ranges are the objects that are refreshed.

Jessica
VeriS Liem said…
If you use Thread.Sleep() call the package from command prompt instead of from SQL Agent. You can schedule it using Windows Schedule Tasks instead of SQL Job.
Unknown said…
hi, i need to create a small gui app that will import different xls files into different sql server 2008 tables based on a 3 character code. do you know if anything like that exists or what's the best way to build it? thanks.
carminenatale@gmail.com
Anonymous said…
Hi Jessica,

I've got 97-2003 excel file with several tabs in it and some pivot tables also.Here I need to refresh this excel file using SSIS package. I saw you are code for 2007 Excel file it would be great help if you could guide me for 97-2003 excel file refresh through SSIS package. Thanks
Unknown said…
I know this is an old post and this is a dumb question, but if you install PrimaryInteropAssembly.exe on your SQL Server, do you still have to install MS Office on it? My guess is yes since it invokes Excel objects that I do not believe comes with the 6 MB install (PrimaryInteropAssembly.exe), but I need a second opinion to convince my DBA. Thank you.
Unknown said…
For the anonymous question before my last comment, there is PIA for Office 2003 as well. Please refer to this link.

http://msdn.microsoft.com/en-us/library/aa159923%28office.11%29.aspx
Jessica M. Moss said…
Hi Joshua C,

I tested my code on a machine that had Office installed, so I cannot say for sure. I would recommend testing this on a clean machine to check.

-Jessica
Thank you - this was *exactly* what I needed to automatically refresh the data in a large collection of Excel files. It worked beautifully. Chris Nelson - Ascentium Corp.
Anonymous said…
Hi Jessica.

Your code is awesome. I used it to create a generic excel file refresh ssis package - just enter the path, filename, password, etc. into variables and it runs, refreshes, and sends an email with the results (stored proc). Awesome!

BUT - sometimes it hangs/exits, and the workbook remains open AND locked to me! Is there a way in code to close it down?

I have this code at the bottom of my sub, added a couple things I thought might work to yours, but doesn't seem to cut it:


On Error Resume Next

'close file without save
wb.Close(False)

'destroy excel object
wb = Nothing
excel.Workbooks(Filename).Close(False)
excel.Quit()
Runtime.InteropServices.Marshal.ReleaseComObject(excel)


Thanks,
Devo
Anonymous said…
Christian.. I faced the same problems (hanging) regarding closing excel after the dtsx execution.

You can find the solution here:
http://devcity.net/PrintArticle.aspx?ArticleID=239
specifically the third option...

Javier
Anonymous said…
Thanks Javier - just what I needed.

I took the code below and inserted it at the beginning of my code AND at the end (where errors go to).

So if you have anything open in Excel it shuts it down, before AND after trying to refresh the file.


Dim proc As System.Diagnostics.Process

'kill all instances of excel
For Each proc In System.Diagnostics.Process.GetProcessesByName("EXCEL")
proc.Kill()
Next


Note: it seems to cause issues if you run it AFTER the dim excel objects code. So, put it before this:

Dim excel As New Microsoft.Office.Interop.Excel.Application
Dim wb As Microsoft.Office.Interop.Excel.Workbook
Dim cn As Microsoft.Office.Interop.Excel.WorkbookConnection


and after this:


wb.Close(False)


-Devo
Unknown said…
Hello guys again. I have to migrate my code to a 64 bits environment. My question is, do you know if there are some known problems with PIA in 64 bits?

Thanks in advance,

Javier
Jessica M. Moss said…
Hi cucho,

I haven't had to implement this on a 64 bit box yet. I know the 2010 version can work in WoW mode, so that may give you something to work with! http://msdn.microsoft.com/en-us/library/15s06t57(VS.100).aspx

Jessica
Anonymous said…
Hi Jessica,

I currently have two versions of excel installed 2003 and 2007. I am trying to run your code and it defaults to excel 2003, I would like to have it open in 2007 because the excel sheet I have has more than 256 columns. Any insight or suggestion would be helpful.

Thanks
Jessica M. Moss said…
Hi Anonymous,

Differentiating between version of Excel is getting out of my realm of expertise. I would suggest looking through Books Online or trying the MSDN forums for Excel for help!

Jessica
Anonymous said…
We used the above code when using SQL Server 2005, but now we are moving to SQL Server 2008 and it doesn't recognize the Dts. code. Any ideas?
Anonymous said…
We are also migrating to SQL Server SSIS 2008 and the Interop Assemblies will not save in the registry if we use vb 2008. Are there newer dll's that we can't find?

This doesn't happen with C# but I am not sure how to translate this code in C# since I have not used it before.

Thanks for any help.
Jessica M. Moss said…
Hi Anonymous #1,

Dts is still the object to use in 2008. Ensure you're using the script task, not the script component.

Hi Anonymous #2,

Using VB.NET shouldn't affect using the Interop Assemblies, so you may want to look more into that. Other than that, you can try converting code by using a website similar to this: http://converter.telerik.com/

HTH,
Jessica
Anonymous said…
use USING statement to help release/dispose the excel object.

ex.

USING

END USING
Anonymous said…
Thank you Jessica, it helped
Anonymous said…
Hello Jessica,

I have an Excel file with macro (Excel 2003 sp 3) that I want to execute thru SSIS. Now I have a job that calls this SSIS package that won't work. If I run the package from BIDS, it works. If I run the package from MSDB > right-click > run package, it works. If I open the Excel, the macro works as well. But not when the job calls it. The system user that runs the SQL Agent is admin in that machine (Windows Server 2003 R2 SP2 64bit).

Help?

Thanks,
Jaynne
Anonymous said…
I forgot to mention that when the SQL Job runs the package, I can see in Task Manager that the batch file is launched (cmd.exe) and Excel file is opened (EXCEL.exe). But it just stays like that, nothing happens, so I had to end task.

Thanks again,
Jaynne
Anonymous said…
This was exactly what I needed. Thank you!
Jessica M. Moss said…
Hi Jaynne,

It sounds as though there is a permission issue. Try using the advice in this KB article: http://support.microsoft.com/kb/918760 . If that doesn't work, you may want to try posting the symptoms to the MSDN forum.

Good luck!
Jessica
Darren Stanger said…
Jessica,

Regarding the instructions below, would you please post the XML from View Code? I'm having trouble finding the Build Settings ReferencePath because I seldom develop in .net.

Thanks,
Darren Stanger, Certiport, Inc.

In your Solution Explorer, right click on the package and select ‘View Code’.

In the resulting XML, change the Build Settings ReferencePath property to:
"C:\Windows\assembly\GAC_MSIL\Microsoft.SqlServer.ScriptTask\9.0.242.0__89845dcd8080cc91\;C:\Windows\assembly\GAC_MSIL\Microsoft.SqlServer.ManagedDTS\9.0.242.0__89845dcd8080cc91\;C:\Windows\assembly\GAC\Microsoft.Office.Interop.Excel\12.0.0.0__71e9bce111e9429c\"

Also change the Build References to include:
Jessica M. Moss said…
Hi Darren,

The XML will be different for every package due to the identifiers of the different elements. I'm afraid it wouldn't help at all. Try performing a file search for "ReferencePath".

Good luck!
Jessica
Niall said…
Hi Jessica,
Thanks heaps for posting your code - it was exactly what I needed.

IOU 1 * Pizza :)

cheers,
Niall
Anonymous said…
I have an Excel file with macro (Excel 2003) that I want to execute thru SSIS. Now I have a procedure that calls this SSIS package that won't work. If I run the package from BIDS, it works. If I open the Excel, the macro works as well. But not when the procedure calls it and inturn the excel process gets blocked. The system user that runs the SQL Agent is admin in that machine.
Any help in this regard would appreciate. Thanks in advance.
Anonymous said…
Jessica:

Any update regarding my yesterday's query? It would be great if you can provide us clarification. Thanks in advance.

Regard,Krishna
Jessica M. Moss said…
Hi Krishna,

I would start to tackle it from the angle of permission errors that occur when executing an SSIS package from SQL Agent. Try these troubleshooting options: http://support.microsoft.com/kb/918760

Jessica
Anonymous said…
Jessica,

I found your post to be very helpful, but it appears it is for SSIS 2005 and not 2008. Is this true?

We are running SSIS 2008.

Your instruction:
"In the resulting XML, change the Build Settings ReferencePath property to:" cannot be found in the XML code in my SSIS 2008 package after making the script task.

How can I accomplish this same thing you instructed in a 2008 package? Can it be done?

Thanks,

David
Kiran Ramaswamy said…
Old thread I know, but helped me quite a bit. Thanks!
laurens said…
Jessica M. Moss said...

Hello,

Here are a couple of things to try... The first is that you might want to look at setting PivotCache.BackgroundQuery = False on any pivot tables you have - those are set to run in the background, so may be where the issue lies. The other is to try inserting a Thread.Sleep() for a few milliseconds/seconds after you call the Refresh for debugging purposes.
----------------------

Thank you Jessica!! After hours of surfing the web... Well, I found you blog post as the first reference but did not went deep enough in the comments.
Thread.sleep did it for me cause I had troubles with a 'too early close' in Excel 2007/2010.

My code:
oApp.DisplayAlerts = False
wb.RefreshAll()
Threading.Thread.Sleep(10000)
wb.Save()
wb.Close()
Anonymous said…
Hi Jessica... I'm running a macro stored in an excel sheet via SSIS. All I'm doing is opening the excel sheet (.xlsm file) running the macro by running the code ExcelObject.Run("Macro1"). However, its returning a COM release error and unable to release excel process. If I simply open and close the the excel sheets it works fine. Only when I try to try to run the macro (which takes about 5-10 mins to complete running) it gives me the error. Please help as its a critical piece of automation for me.
P.S. - I have already installed Office 2010 Interlop assemblies

Thanks in advance,
Aanjaney
Anonymous said…
You dont need to have a thread that sleeps if you disable all of the background updates from the connections in the excel files.
Ganesh said…
Hi,

I am trying to use Excel Interop with SSIS and i am getting an error "unable to load Interop" when i run the package on a 64bit SQL Server 2008.

Do i have to install excel on the server to use excel interop?

Thanks.
Amar said…
Hi Jessica, I tried your solution and two things. I am unable to find the build settings reference section in the XML of the package and two when i try to run the script task ssis throws the following error:

at System.RuntimeMethodHandle._InvokeMethodFast(IRuntimeMethodInfo method, Object target, Object[] arguments, SignatureStruct& sig, MethodAttributes methodAttributes, RuntimeType typeOwner)
at System.RuntimeMethodHandle.InvokeMethodFast(IRuntimeMethodInfo method, Object target, Object[] arguments, Signature sig, MethodAttributes methodAttributes, RuntimeType typeOwner)
at System.Reflection.RuntimeMethodInfo.Invoke(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture, Boolean skipVisibilityChecks)
at System.Reflection.RuntimeMethodInfo.Invoke(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)
at System.RuntimeType.InvokeMember(String name, BindingFlags bindingFlags, Binder binder, Object target, Object[] providedArgs, ParameterModifier[] modifiers, CultureInfo culture, String[] namedParams)
at Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTATaskScriptingEngine.ExecuteScript()

I am trying to the very simple task of opening an macro enabled excel file and refreshing all data and closing it. my code is as shown below:

public void Main()
{
Dts.TaskResult = (int)ScriptResults.Success;

Microsoft.Office.Interop.Excel.Application oExcel = new Microsoft.Office.Interop.Excel.Application();
Microsoft.Office.Interop.Excel.Workbook oBook = new Microsoft.Office.Interop.Excel.Workbook();

String sDataSheet = (String)Dts.Variables["User::ExcelOutputFilePath"].Value;

oBook = oExcel.Workbooks.Open(@"strDataSheet");

oBook.RefreshAll();
oBook.Save();
oBook.Close();
oExcel.Quit();

System.Runtime.InteropServices.Marshal.FinalReleaseComObject(oExcel);


}

Please help.
Anonymous said…
Hi Jessica, I tried your solution and I got two errors and one warnning:
Error: 0x4 at Script Task: The binary code for the script is not found.
Error: 0xC0024107 at Script Task: There were errors during task validation.
Warning: 0x80019002 at Package_REFRESH_EXCEL_FILE: SSIS Warning Code DTS_W_MAXIMUMERRORCOUNTREACHED.



I run like admin
the system is win 7/64bit with SQL 2008 r2
I had added the Microsoft.Office.Interop.Excel without problem
The file is share locally
I am using the same code


What could be am missing?

need help
thanks
Anonymous said…
IS a command line to disable alert message ??
Anonymous said…
How do you handle the refresh of the Excel file if it connects to SQL Server. Windows Authentication is not used to login to SQL Server and there is a window prompt "SQL Server Login" for the "Login ID" and "Password". Thanks, Angel
Jessica M. Moss said…
Comments are now closed on this post.

Popular posts from this blog

Reporting Services 2008 Configuration Mistake

To start working with the management side of SQL Server Reporting Services 2008, I decided to set up a report server and report manager. Unfortunately, I made a mistake while setting up my configuration that left me a little perplexed. Here are the steps I took to cause, track down, and solve the issue. Problem: I began by opening the Reporting Services Configuration Manager from the Start Menu. I clicked through each of the menu options and accepted the defaults for any question with a warning symbol, since warning symbol typically designate an action item. After two minutes, all of the warning symbols had disappeared, and I was ready to begin managing my report server. Unfortunately, opening up a browser and trying to open up the report manager resulted in the dreaded " The report server has encountered a configuration error. (rsServerConfigurationError) " message. Sherlock-ing it: I put on my sleuthing hat and went to the log file directory: C:\Program Files\Microsoft...

Execute SQL Task Designer Limit

After migrating a package from DTS to SSIS, I had a problem with an Execute SQL Task. I couldn't change any characters in the SQLStatement property; I couldn't add any new characters; I could delete characters, but not retype them! After googling several variations of "integration services" "read only" and "Execute SQL Task", I deleted about half of the entry in a fit of frustration. Lo and behold, I could type again. Apparently, there is limit on the size or number of characters that can be entered in the SQLStatement property. From my experimentation, I came up with a limit of 32767 characters. The interesting thing is that the restriction only seems to be on the designer. If you set the SourceType to "Variable" and use a variable that contains more than 32767 characters, the task will execute. Also, if you use the "Direct Input" SourceType and modify the package XML to set the SQLStatement longer than 32767 characters, ...