Performance

879 views
Skip to first unread message

Sergey Zhilyakov

unread,
Mar 27, 2012, 2:31:13 PM3/27/12
to Excel-DNA
Hi Govert,

I would like to ask you concerning the performance of the call of
ExcelDna functions.

I have ExcelDna functions that work fine. I also have C# COM Addin
(not ExcelDna) which inherits the IDTExtensibility2 interface. This
COM Addins recalculates a worksheet as follows:

void Recalculate()
{
ExcelApplication.ActiveSheet.Calculate();
}

where ExcelApplication is the 'Application' instance.

When I profile the code, I noticed that the ExcelDna function is quite
fast. But the 'Calculate' method takes much more time.

Here is profile results: http://i40.tinypic.com/qqo9qt.png

The uniform_variate is the ExcelDna function:
object Uniform_variate(double, double, object, object)
{
...
}

The worksheet contains 10 ExcelDna functions. They take 4% of time.
But the Calculate method takes up to 70%.

Is there a way to speed up this process?

As I know, ExcelDna hosts CLR and calls C# functions using CLR
interfaces. How fast these calls are?

Thanks in advance.

Best regards,
Sergey

Govert van Drimmelen

unread,
Mar 27, 2012, 3:04:23 PM3/27/12
to Excel-DNA
Hi Sergey,

Excel-DNA is very fast, and the calls to managed code add little
overhead.
Your COM calls are likely to be a bottleneck, and can possibly be
replaced with C API calls.

-----

Excel-DNA registers a native function pointer with Excel, which calls
into the managed code for every function evaluation.

The overhead when calling an Excel-DNA function from Excel consists of
the native-to-managed transition, the Excel-DNA parameter marshaling
and the managed-to-native transition back. The cost of the native-to-
managed transition is some tens of processor instructions, so it's
pretty light-weight.
All the work of hosting the CLR and setting up the function calls is
done once when the add-in is loaded, so don't affect the individual
function calls.

To really test how fast the Excel-DNA calls are, you should fill up a
sheet with your formula (say 1 million cells) and see how long one
calculation takes. From my testing I expect that Excel can make
between 300,000 and 1,000,000 managed function calls per second.

In your profiling I see:
* For the managed code itself, 500,000 calls to Uniform_variate, took
738 ms
* The marshalling for the two 'object' parameters, 1,000,000 times to
the marshaler, took 282 ms
* The return marshaling for the 'object' return value, 500,000 times,
took 128 ms.
* Some clean-up.

So the managed code for your 500,000 calls was about 1.1 seconds, to
which I'd guess we add 0.5 seconds for the native->managed transition,
say 1.6 seconds in total.

You can check this by filling your sheet with 1,000,000 cells calling
Uniform_variate, and recalculating once.

Now - your call to
ExcelApplication.ActiveSheet.Calculate();
will be a COM call into Excel (actually two COM calls). This is likely
to be much slower than the Excel-DNA function calls.
Your 50,000 calls to Calculate took 13.7 seconds, of which I think the
Excel-DNA functions took 1.6 seconds.

You could test by changing your sheet formula to be trivial (just
adding some numbers), and checking the calculate calls. I expect it
would take as long.

It would be nice if you could confirm this.

------

How can you make the Calculate faster?

Instead of making the COM calls via Application.ActiveSheet.Calculate,
rather implement the recalc logic in a macro (which could be in your
Excel-DNA add-in). From this macro, you can use the C API to
manipulate the sheet - set values using ExcelReference.SetValue) and
recalculate using XlCall.Excel(XlCall.xlcCalculateNow) or
XlCall.Excel(XlCall.xlcCalculateDocument);

I expect this to have much lower overhead than the COM calls.

------

I'd be happy to have a closer look if we need to understand it better.
Let us know what you find.

Regards,
Govert

Sergey Zhilyakov

unread,
Mar 28, 2012, 5:39:17 PM3/28/12
to Excel-DNA
Hi Govert,

thanks for the detailed reply.

The profiling results are not so obvious for me right now. If I fill a
worksheet with the 500,000 Uniform_variate functions and press 'F9',
it takes up to 6-7 seconds, but not 1,6.

I also made a few tests that make me confised. Here they are.

[ExcelCommand(IsExceptionSafe = true, Name = "OnCalculateActiveSheet",
MenuName = "CalculateSheet", MenuText = "Calculate Sheet via C API")]
public static void OnCalculateActiveSheet()
{
try
{
var start = DateTime.Now;

for (var i = 0; i < 10000; i++)
{
XlCall.Excel(XlCall.xlcCalculateDocument);
}

XlCall.Excel(XlCall.xlcMessage, true, (DateTime.Now -
start).TotalSeconds.ToString());
}
catch (Exception ex) { }
}

[ExcelCommand(IsExceptionSafe = true, Name =
"OnCalculateActiveSheet2", MenuName = "CalculateSheet", MenuText =
"Calculate Sheet via COM")]
public static void OnCalculateActiveSheet2()
{
try
{
var start = DateTime.Now;

for (var i = 0; i < 10000; i++)
{
((Worksheet)
ExcelApplication.ActiveSheet).Calculate();
}

XlCall.Excel(XlCall.xlcMessage, true, (DateTime.Now -
start).TotalSeconds.ToString());
}
catch (Exception ex) {}
}

As you can see, the first function uses XlCall.Excel to calculate a
sheet. The second one uses COM.

The problem is that these functions have equal results, i.e. they have
the same performance.

For testing I used the following worksheets:
* 1 RAND() function;
* 100 RAND() functions;
* 1 Uniform_variate function;
* 100 Uniform_variate function;


Are these tests correct?


Best regards,
Sergey

Govert van Drimmelen

unread,
Mar 28, 2012, 8:05:06 PM3/28/12
to exce...@googlegroups.com
Hi Sergey,

You're right! The calculation called via the C API and via COM performs about the same (maybe the COM is even marginally faster).

I tested a bit with the .dna file copied below, in my Excel 2010 (32-bit Excel, 64-bit Win 7 SP1, i7-2620 @ 2.7 GHz).

I found the following, testing different function signatures and combinations of (calculations x cells):
(The functions are defined in the .dna file, and I show the elapsed and resulting calculations per second.)

calcs x cells elapsed per second
=RAND() - C API
10,000 x 500 1.117 4 474 673
=RAND() - COM
10,000 x 500 1.075 4 652 894
=RAND()
1 x 1,000,000 0.155 6 451 613

=MultiThreadDangerous(RAND())
10,000 x 500 4.886 1 023 262
1 x 1,000,000 0.859 1 163 693

=Dangerous(RAND())
10,000 x 500 6.318 791 390
1 x 1,000,000 1.173 852 878

=MultiThread(RAND())
10,000 x 500 17.335 288 438
1 x 1,000,000 3.550 281 690

=DoubleToDouble(RAND())
10,000 x 500 28.828 173 444
1 x 1,000,000 5.892 169 712

=ObjectToString(RAND())
10,000 x 100 11.917 83 911
1 x 1,000,000 11.026 90 692

=Uniform_variant(1,2,RAND(),3)
10,000 x 100 12.580 79 493
1 x 500,000 5.804 86 147
1 x 1,000,000 11.493 87 009

So for Excel-DNA function the calcs-per-second range from 1,000,000 per second to 80,000 per second, depending on the signature, multithread and try-catch choices.
Dangerous (IsExceptionSafe=true) means unhandled exceptions will crash Excel - normally Excel-DNA wraps the function in a try-catch and returns 'object'.

The fastest function is just a double->double, marked IsThreadSafe and IsExceptionSafe.

Your Uniform_variant function calculates about 80,000 cells per second.
So I agree those calls won't be as fast as I suggested.

Do you find about the same?

Regards,
Govert



<DnaLibrary Name="Performance Tests" Language="C#" RuntimeVersion="v4.0">
<Reference Name="Microsoft.Office.Interop.Excel" />
<![CDATA[
using System;
using System.Diagnostics;
using Microsoft.Office.Interop.Excel;
using ExcelDna.Integration;

public static class PerformanceTest
{

[ExcelCommand(Name = "OnCalculateActiveSheetOnce", MenuName = "CalculateSheet", MenuText = "Calculate Sheet once via C API")]
public static void OnCalculateActiveSheetOnce()
{
XlCall.Excel(XlCall.xlcEcho, false);

var stopwatch = Stopwatch.StartNew();
XlCall.Excel(XlCall.xlcCalculateDocument);
stopwatch.Stop();
XlCall.Excel(XlCall.xlcEcho, true);
XlCall.Excel(XlCall.xlcMessage, true, stopwatch.Elapsed.TotalSeconds.ToString("F3"));
}

[ExcelCommand(Name = "OnCalculateActiveSheetOnce2", MenuName = "CalculateSheet", MenuText = "Calculate Sheet once via COM")]
public static void OnCalculateActiveSheetOnce2()
{
var xlApp = (Application)ExcelDnaUtil.Application;
Worksheet activeSheet = (Worksheet)xlApp.ActiveSheet;
xlApp.ScreenUpdating = false;
var stopwatch = Stopwatch.StartNew();

activeSheet.Calculate();

stopwatch.Stop();
xlApp.ScreenUpdating = true;
xlApp.StatusBar = stopwatch.Elapsed.TotalSeconds.ToString("F3");
}


[ExcelCommand(Name = "OnCalculateActiveSheet", MenuName = "CalculateSheet", MenuText = "Calculate Sheet via C API")]
public static void OnCalculateActiveSheet()
{
XlCall.Excel(XlCall.xlcEcho, false);
var stopwatch = Stopwatch.StartNew();

for (var i = 0; i < 10000; i++)
{
XlCall.Excel(XlCall.xlcCalculateDocument);
}

stopwatch.Stop();
XlCall.Excel(XlCall.xlcEcho, true);
XlCall.Excel(XlCall.xlcMessage, true, stopwatch.Elapsed.TotalSeconds.ToString("F3"));
}

[ExcelCommand(Name = "OnCalculateActiveSheet2", MenuName = "CalculateSheet", MenuText = "Calculate Sheet via COM")]
public static void OnCalculateActiveSheet2()
{
var xlApp = (Application)ExcelDnaUtil.Application;
Worksheet activeSheet = (Worksheet)xlApp.ActiveSheet;
xlApp.ScreenUpdating = false;
var stopwatch = Stopwatch.StartNew();

for (var i = 0; i < 10000; i++)
{
activeSheet.Calculate();
}

stopwatch.Stop();
xlApp.ScreenUpdating = true;
xlApp.StatusBar = stopwatch.Elapsed.TotalSeconds.ToString("F3");
}


public static object Uniform_variate(double val1, double val2, object obj1, object obj2)
{
return obj1;
}

[ExcelFunction(IsExceptionSafe=true, IsThreadSafe=true)]
public static double MultiThreadDangerous(double val1)
{
return val1 + 1.0;
}

[ExcelFunction(IsExceptionSafe=true)]
public static double Dangerous(double val1)
{
return val1 + 1.0;
}

[ExcelFunction(IsThreadSafe=true)]
public static double MultiThread(double val1)
{
return val1 + 1.0;
}

public static double DoubleToDouble(double val1)
{
return val1 + 1.0;
}

public static string DoubleToString(double val1)
{
return val1.ToString();
}

public static string ObjectToString(object obj1)
{
return obj1.ToString();
}
}
]]>
</DnaLibrary>

Sergey Zhilyakov

unread,
Mar 30, 2012, 3:32:13 PM3/30/12
to Excel-DNA
Hi Govert,

thanks for the reply.

I can confirm these results.


The performance was improved by replacing
ExcelApplication.ActiveSheet.Calcuate() with
ExcelAplication.ActiveSheet.UsedRange.Calculate(). At least it works
for the worksheets I have deal with. Moreover, if the calculation is
performed in the loop, the performance can also be improved a bit by
obtaining the UsedRange before the loop.

Maybe it might be helpful for someone else.

Best regards,
Sergey

Govert van Drimmelen

unread,
Mar 31, 2012, 11:21:31 AM3/31/12
to Excel-DNA
Hi Sergey,

The other call to compare would be Application.Calculate, which should
also just calculate whatever is needed, and the C API call
xlcCalculate (instead of xlcCalculateDocument).

I guess the other part of your calculation loop we have not discussed
is that you are probably changing some cell values in each loop. Again
this can be done with the COM interfaces using Range.Value = ... or
with the C API using ExcelReference.SetValue(..). In both cases one
would rather make a single call that sets an array in a large range,
rather then many individual calls.

Have you done any timing or testing on this aspect?

Regards,
Govert
Reply all
Reply to author
Forward
0 new messages