Excel C-API vs Interop

66 views
Skip to first unread message

Terry Aney

unread,
May 15, 2024, 3:22:49 PMMay 15
to Excel-DNA
Seeing some strange results in a sense that Excel C Api is performing way slower than Interop COM objects.  I'm wondering if (with the small context of code) you have any guesses.

This message is displayed after running code below:

---------------------------
CalcEngine Audit
---------------------------
00:00:00.0313122
00:00:02.2352662
---------------------------
OK  
---------------------------

var sw = System.Diagnostics.Stopwatch.StartNew();
var configuration = new ExcelCalcEngineConfigurationFactory( application.ActiveWorkbook ).Configuration;
sw.Stop();
var elapsedInterop = sw.Elapsed;

sw = System.Diagnostics.Stopwatch.StartNew();
configuration = new ExcelDnaCalcEngineConfigurationFactory().Configuration;
sw.Stop();
var elapsedApi = sw.Elapsed;

MessageBox.Show( $"{elapsedInterop}{Environment.NewLine}{elapsedApi}", "CalcEngine Audit", MessageBoxButtons.OK, MessageBoxIcon.Information );

My two classes are derive from generic class, their signatures are:

public class ExcelCalcEngineConfigurationFactory : CalcEngineConfigurationFactory<MSExcel.Workbook, MSExcel.Worksheet, MSExcel.Range>
public class ExcelDnaCalcEngineConfigurationFactory : CalcEngineConfigurationFactory<ExcelWorkbook, ExcelWorksheet, ExcelReference>

The base `CalcEngineConfigurationFactory` simply loops tabs and ranges and grabs values to build a configuration.  I can show that code if needed.  There are also some extensions you'll see in the ExcelDna* version - I could show, but again they just wrap XlCall.Excel calls.  

The two generic classes are as follows.  If you have any guesses why the Interop would work 150x faster than C API let me know :)

public class ExcelDnaCalcEngineConfigurationFactory : CalcEngineConfigurationFactory<ExcelWorkbook, ExcelWorksheet, ExcelReference>
{
    public ExcelDnaCalcEngineConfigurationFactory() { }

    protected override string FileName => (string)XlCall.Excel( XlCall.xlfGetDocument, (int)GetDocumentType.ActiveWorkbook );
    protected override string Version => ExcelApi.GetReference( "Version" ).GetText() ?? "Unknown";
    protected override ExcelWorksheet[] Worksheets => new ExcelWorkbook( ExcelApi.Selection ).Worksheets.ToArray();
    protected override ExcelWorkbook GetWorkbook( ExcelReference range ) => new( range );
    protected override ExcelWorkbook GetWorkbook( ExcelWorksheet sheet ) => sheet.Workbook;
    protected override ExcelWorksheet GetSheet( ExcelReference range ) => new( range );
    protected override string GetName( ExcelWorksheet sheet ) => sheet.Name;
    protected override string? RangeTextOrNull( string name ) => name.GetReferenceOrNull()?.GetText();
    protected override string? RangeTextOrNull( ExcelWorksheet sheet, string name ) => name.GetReferenceOrNull( sheet.Name )?.GetText();
    protected override ExcelReference GetRange( string name ) => name.GetReference();
    protected override ExcelReference GetRange( ExcelWorksheet sheet, string name ) => name.GetReference( sheet.Name );
    protected override ExcelReference Offset( ExcelReference range, int rowOffset, int columnOffset ) => range.Offset( rowOffset, columnOffset );
    protected override ExcelReference EndRight( ExcelReference range ) => range.End( DirectionType.ToRight );
    protected override bool RangeExists( string name ) => name.GetReferenceOrNull() != null;
    protected override bool RangeExists( ExcelWorksheet sheet, string name ) => name.GetReferenceOrNull( sheet.Name ) != null;
    protected override string GetAddress( ExcelReference range ) => range.GetAddress().Split( '!' ).Last();
    protected override string GetText( ExcelReference range ) => range.GetText() ?? "";
    protected override string GetFormula( ExcelReference range ) => range.GetFormula() ?? "";
}

public class ExcelCalcEngineConfigurationFactory : CalcEngineConfigurationFactory<MSExcel.Workbook, MSExcel.Worksheet, MSExcel.Range>
{
    private readonly MSExcel.Workbook workbook;

    public ExcelCalcEngineConfigurationFactory( MSExcel.Workbook workbook ) => this.workbook = workbook;
   
    protected override string FileName => workbook.Name;
    protected override string Version => workbook.RangeOrNull<string>( "Version" )!;
    protected override MSExcel.Worksheet[] Worksheets => workbook.Worksheets.Cast<MSExcel.Worksheet>().ToArray();
    protected override MSExcel.Workbook GetWorkbook( MSExcel.Range range ) => GetWorkbook( range.Worksheet );
    protected override MSExcel.Workbook GetWorkbook( MSExcel.Worksheet sheet ) => ( sheet.Parent as MSExcel.Workbook )!;
    protected override MSExcel.Worksheet GetSheet( MSExcel.Range range ) => range.Worksheet;
    protected override string GetName( MSExcel.Worksheet sheet ) => sheet.Name;
    protected override string? RangeTextOrNull( string name ) => workbook.RangeOrNull<string>( name );
    protected override string? RangeTextOrNull( MSExcel.Worksheet sheet, string name ) => sheet.RangeOrNull<string>( name );
    protected override MSExcel.Range GetRange( string name ) => workbook.RangeOrNull( name )!;
    protected override MSExcel.Range GetRange( MSExcel.Worksheet sheet, string name ) => sheet.Range[ name ];
    protected override MSExcel.Range Offset( MSExcel.Range range, int rowOffset, int columnOffset ) => range.Offset[ rowOffset, columnOffset ];
    protected override MSExcel.Range EndRight( MSExcel.Range range ) => range.End[ MSExcel.XlDirection.xlToRight ];
    protected override bool RangeExists( string name ) => workbook.RangeOrNull( name ) != null;
    protected override bool RangeExists( MSExcel.Worksheet sheet, string name ) => sheet.RangeOrNull( name ) != null;
    protected override string GetAddress( MSExcel.Range range ) => range.Address;
    protected override string GetText( MSExcel.Range range ) => range.GetText();
    protected override string GetFormula( MSExcel.Range range ) => ( (string)range.Formula )!;
}

Govert van Drimmelen

unread,
May 15, 2024, 4:28:35 PMMay 15
to exce...@googlegroups.com

Hi Terry,

 

I think you’d need to unwind the set of C API calls (or both sets of calls) a bit to figure out where the difference comes from.

 

Various things make the C API more limited than the COM object model, and could causes your code to take a slower path.

For example, sometimes when using the C API you need to change the selection or the active sheet first, then you can read or change some part of the workbook.

In the COM object mode, you could directly get a reference to that range and modify it.

Something like that could cause a big difference.

 

Also, if you have COM event handlers set up in your code, or in VBA code in the workbook , that might causes context switching the causes delay.

 

In general, I don’t think the COM object model should be ‘slow’, but your case does ask for some explanation.

Could you try to narrow the comparison down a bit, to see what sequence of calls is so slow?

 

-Govert

--
You received this message because you are subscribed to the Google Groups "Excel-DNA" group.
To unsubscribe from this group and stop receiving emails from it, send an email to exceldna+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/exceldna/97355daf-623b-40eb-8fce-a32d320a6f98n%40googlegroups.com.

Terry Aney

unread,
May 15, 2024, 10:10:03 PMMay 15
to Excel-DNA
So I just used standard Stopwatch object.  And here's where I discovered.

Activate( calcEngineSheet.Sheet );

var sw = Stopwatch.StartNew();

var inputs = tabInputs.Inputs.Concat( resultTabInputs?.Inputs ?? Array.Empty<CalcEngineCell>() ).ToArray();
string? getInputAddress( string name ) => inputs.FirstOrDefault( i => i.Name == name )?.Address;

var tabDefTables = GetTabDefTables( calcEngineSheet.Sheet );

Console.WriteLine( $"{GetType().Name} - GetTabDefConfiguration.GetTabDefTables, {sw.ElapsedMilliseconds}ms" );
sw.Restart();


This chunk of code took ~10x longer in C API (40ms vs 600ms) than COM.  My implementation of the 'Activate' method I just added based on your comment was as follows for the two types:

protected override void Activate( ExcelWorksheet sheet ) => XlCall.Excel( XlCall.xlcWorkbookActivate, sheet.Name );
protected override void Activate( MSExcel.Worksheet sheet ) => sheet.Activate();

Finally, the GetTabDefTables was trimmed down a lot too for this test and I ended up with only this:

TabDefTables GetTabDefTables( TWorksheet sheet )
{
    var inputTables = new List<CalcEngineTable>();
    var dataTables = new List<CalcEngineTable>();
    var globalTables = new List<CalcEngineTable>();

    var table = Offset( GetRange( sheet, "StartTables" ), 1, 0 );

    string? tableName;
    while ( table != null && !string.IsNullOrEmpty( tableName = GetText( table ) ) )
    {
        var isGlobalTable = tableName.StartsWith( "<<" );
        var isDataTable = !isGlobalTable && tableName.StartsWith( "<" );
        var isConfigureUIDataTable = tableName.StartsWith( "[" );

        tableName = tableName.Replace( "<", "" ).Replace( ">", "" ).Replace( "[", "" ).Replace( "]", "" );
        var isWorkTable = tableName.Contains( "/work-table" ) || tableName.Contains( "/off" );

        var hasColumns = !string.IsNullOrEmpty( GetText( Offset( table, 1, 0 ) ) );
        var addressLeft = GetAddress( Offset( table, 2, 0 ) );
        var addressRight = !string.IsNullOrEmpty( GetText( Offset( table, 1, 1 ) ) )
            ? GetAddress( Offset( EndRight( Offset( table, 1, 0 ) ), 1, 0 ) )
            : GetAddress( Offset( table, 2, 0 ) );

        table = hasColumns ? Offset( GetRange( sheet, addressRight ), -2, 2 ) : null;
    }

    return new()
    {
        Tables = inputTables.ToArray(),
        DataTables = dataTables.ToArray(),
        GlobalTables = globalTables.ToArray()
    };
}

The tabs that are being processed look something like the following and it is simply just getting the table name (row 11), then walking all the columns for the table (row 12) until it finds a blank, then looks for a new table and continues if finds one (note this code has all the code out that actually builds the CalcEngineTable items in the unused lists but that didn't affect speed).

Screenshot 2024-05-15 210832.png

Provide anymore insight?  Did I use the right C API to 'select' sheet before processing?

Terry Aney

unread,
May 17, 2024, 12:10:14 PMMay 17
to Excel-DNA
One other interesting fact.  If I open the Spreadsheet and run my code, the C API takes 165ms and Interop takes 30ms.  Then if I modify the sheet to cause an 'audit error' (i.e. duplicate table name) and run again, it fails.  Then if I 'undo' and run again, then that is when the difference is really visible.  C API takes 2144ms and Interop still takes 30ish-ms.

Still digging...

Terry Aney

unread,
May 17, 2024, 12:11:33 PMMay 17
to Excel-DNA
@Govert, one more question.

For example, sometimes when using the C API you need to change the selection or the active sheet first, then you can read or change some part of the workbook.

So do you suggestion setting active 'book' and 'sheet' before any reference?  Any samples in your github account?

Thanks,
Terry

Govert van Drimmelen

unread,
May 17, 2024, 2:10:06 PMMay 17
to exce...@googlegroups.com

Hi Terry,

 

I don’t have a particular case where you need to set the sheet – I just remember some cases where you have to use the selection when doing things with the C API, where the COM object model has a more direct approach.

 

It’s hard to say anything more without a direct list of the sequence of C API calls vs. COM object model calls that you’re making.

I don’t know if your ‘tables’ are defined in the sheet as real Excel Tables. If so, your COM object model might be reading these directly, since the Tables are represented in the COM object model (I think they’re ListObjects?). Since the Tables feature is new (only added about 20 years ago) the C API does not have a way to directly deal with them.

 

There might be other ways to speed up what you are doing – using the C API or COM.

If I understand right you are scanning across one or two specific rows of the sheet, and find contiguous sets of column headers to detect tables.

This can be made fast by reading the whole row(s) in a single call, and then processing the resulting array in code without further calls to Excel.

 

This approach does not answer why some C API calls might be particularly slow, of course.

Terry Aney

unread,
May 17, 2024, 8:25:35 PMMay 17
to Excel-DNA
OK more interesting info.  In my `End` function I was using your RestoreSelection helper (this code has been carried forward forever)...

    public static ExcelReference End( this ExcelReference reference, DirectionType direction, bool ignoreEmpty = false )
    {
        ExcelReference end = null!;

        reference.RestoreSelection( () =>
        {
            var value =
                ignoreEmpty ? ExcelEmpty.Value :
                direction == DirectionType.Down ? reference.Offset( 1, 0 ).GetValue() :
                direction == DirectionType.ToRight ? reference.Offset( 0, 1 ).GetValue() :
                direction == DirectionType.ToLeft ? reference.Offset( 0, -1 ).GetValue() :
                /* DirectionType.Up */                reference.Offset( -1, 0 ).GetValue();

            var isEmpty = value == ExcelEmpty.Value || ( value.GetType() == typeof( string ) && string.IsNullOrEmpty( (string)value ) );

            if ( !ignoreEmpty && isEmpty )
            {
                end = reference;
            }
            else
            {
                // Govert talks about 'messiness' at http://stackoverflow.com/a/10920622/166231, but pretty straight forward
                XlCall.Excel( XlCall.xlcSelectEnd, (int)direction );

                var selection = ( XlCall.Excel( XlCall.xlfSelection ) as ExcelReference )!;
                var row = selection.RowFirst;
                var col = selection.ColumnFirst;

                end = new ExcelReference( row, row, col, col, selection.SheetId );
            }
        } );

        return end;
    }

This was essentially the culprit.  But I'm confused.  Points of interest...

1. When I use this everything works, but...grr, just figured out.  I was going to say, I don't understand why it works because I'd still expect things to not be in proper selection, but in your helper, the last thing you do before executing the Action passed in is:

                // Select the range caller desires to be active...
                XlCall.Excel( XlCall.xlcSelect, reference );

So that is how the proper cell was selected before I called my 'xlcSelectEnd'.

2. This question still remains though.  When I didn't use your RestoreSelection function but instead just called select myself, like the following, everything was then fast.  C Api was about 50ms with COM being 35-40ms. I don't know why RestoreSelection would cause so much slowness looking at the code. 

    protected override ExcelReference EndRight( ExcelReference range )
    {
        var valToRight = Offset( range, 0, 1 ).GetValue();
        var isEmpty = valToRight == ExcelEmpty.Value || ( valToRight.GetType() == typeof( string ) && string.IsNullOrEmpty( (string)valToRight ) );

        if ( isEmpty )
        {
            return range;
        }
        else
        {
            var sheetName = (string)XlCall.Excel( XlCall.xlSheetNm, range );
            XlCall.Excel( XlCall.xlcWorkbookSelect, new object[] { sheetName } );
            XlCall.Excel( XlCall.xlcSelect, range );
            XlCall.Excel( XlCall.xlcSelectEnd, (int)DirectionType.ToRight );

            var selection = ( XlCall.Excel( XlCall.xlfSelection ) as ExcelReference )!;
            var row = selection.RowFirst;
            var col = selection.ColumnFirst;

            return new ExcelReference( row, row, col, col, selection.SheetId );
        }
    }

3. As you mentioned, I could probably just read entire two rows and walk the object[,] array and it would eliminate the need for EndRight.  But I'll eventually need 'EndDown' (to read actual data (variable rows) of each table).  So contemplating whether or not to remove it.

Any comments/aha's are welcome.

Thanks.
Reply all
Reply to author
Forward
0 new messages