Marshaling and type mapping design

Govert van Drimmelen

Nov 29, 2010, 3:09:29 AM
to Excel-Dna
I got the following message last week from an Excel-Dna user. I'm
reposting it here for reference and for further discussion.

- - - - - - - - - - - - - - -

Govert,

I have been using ExcelDna for my Excel add-in application and I have
a few suggestions that you might find useful. The main motivation for
me to approach you with these suggestions is very practical – to make
ExcelDna easier to use and eliminate some duplication of logic.

1) When a nullable C# data type (array, string, or object) is marshaled
from Excel to the C# application and the Excel argument is omitted,
null should be passed to the C# code. Currently the user's C# code is
not even invoked: ExcelDna fails and returns a #VALUE error to Excel.
This behavior becomes an especially big obstacle when marshaling
double[] or double[,] data types, because I have to marshal them as
objects, check every special case on my side (double, string, etc.),
and unbox every element of the array into a double in an intelligent way.
This logic, in my humble opinion, belongs to ExcelDna rather than to a
user application. C# nullable data types, such as arrays and strings,
were made nullable for a reason and null is an equally valid value for
these data types. So, my suggestion is to pass those null values to
user application and let the user decide in each particular case
whether null is a good value or not, and if it is not, let the user
handle the situation properly (e.g., by producing a meaningful error
message).
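
The unboxing described above can be sketched roughly as follows. This
is an illustrative example, not part of ExcelDna: the function name,
and the policies for missing and empty values, are choices made in the
add-in code, which is exactly the duplication the writer is objecting to.

```csharp
using ExcelDna.Integration;

public static class MyFunctions
{
    // Sketch of the wrapper logic described above: the argument is
    // registered as object so the function is always invoked, and the
    // add-in unboxes the incoming values itself.
    [ExcelFunction(Description = "Sums a range, tolerating omitted input")]
    public static object MySum(object values)
    {
        // An omitted argument arrives as ExcelMissing, an empty cell as
        // ExcelEmpty; the add-in decides what these mean (here: zero).
        if (values is ExcelMissing || values is ExcelEmpty)
            return 0.0;

        object[,] array = values as object[,];
        if (array == null)
        {
            // A single-cell reference comes through as a scalar.
            if (values is double) return (double)values;
            return ExcelError.ExcelErrorValue;
        }

        double total = 0.0;
        foreach (object cell in array)
        {
            if (cell is double)
                total += (double)cell;
            else if (cell is ExcelEmpty)
                continue;                          // skip empty cells
            else
                return ExcelError.ExcelErrorValue; // string, error, etc.
        }
        return total;
    }
}
```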

2) By the same logic, I suggest treating non-nullable types in a
similar fashion. For example, a missing bool argument becomes false,
and a missing DateTime argument becomes DateTime.MinValue, etc. For
non-nullable types, assigning some default value to a missing parameter is
still better than returning control back to Excel with a meaningless
#VALUE error. Users who want to handle missing arguments properly can
use C# object types instead, but my experience has been that assigning
default values is good enough for most cases. By doing that you are
not taking away the route for proper handling, but at the same time
you would allow users like myself to save on some extra code that has
to be put in place to handle object data type.
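
The defaulting the writer has in mind can be approximated today by
taking object and applying the default in user code; a minimal sketch
(the function name and the "missing means false" policy are
illustrative):

```csharp
using ExcelDna.Integration;

public static class DefaultingFunctions
{
    // Suggestion 2 asks ExcelDna to perform this coercion internally;
    // with the current design the add-in takes object and applies the
    // default itself.
    [ExcelFunction]
    public static string DescribeFlag(object flagArg)
    {
        // Missing bool argument -> false, as the suggestion proposes.
        bool flag = (flagArg is bool) ? (bool)flagArg : false;
        return flag ? "on" : "off";
    }
}
```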

3) It would be better if ExcelDna did not have the types ExcelEmpty
and ExcelMissing. The standard .NET type System.Reflection.Missing is
good enough already, and adding ExcelEmpty and ExcelMissing brings no
value. It makes user applications bigger, because now everyone has to
check for ExcelEmpty and ExcelMissing in addition to checking for the
standard Missing.
The way I see it, ExcelEmpty and ExcelMissing are just unnecessary
overhead.

4) I recommend adding a couple more data types for marshaling: string
arrays, both one-dimensional and two-dimensional, as with the double
arrays. Currently I have to pass object and keep all the marshaling
logic around that on my side.

These suggestions are based on my experience with ExcelDna, and quite
frankly, I cannot see how others might have avoided dealing with
exactly the same issues. I had to create a whole wrapper around
ExcelDna, but I would rather use more of your code and write less of
my own to marshal the data. Resolving all these issues at the ExcelDna
level is going to be more efficient from a performance standpoint,
would eliminate some code redundancy, and would save people time. What do
you think?

Govert van Drimmelen

Nov 29, 2010, 3:11:24 AM
to Excel-Dna
Hi,

Excel-Dna has to mediate between the Excel C API and the .NET type
system. This leads to a certain tension and many choices in the
design, and I may well have taken an approach that is not ideal.
I did not start off with any knowledge or experience with the C API,
and things have developed over time. I am also likely to have to
support the project for many years to come, so I am often a bit
cautious in making changes that I feel uncertain about. The details of
the marshaling, and particularly the topics you discuss here, which
concern the integration interface presented to Excel-Dna users, are
aspects I have been conservative about for a while.

I’d like to give you some background on how some of the design aspects
came about, and then look forward a bit at how I’d like to see it
evolve.

My departure point for the marshaling types was the following:
* I wanted to present an interface that was faithful to the Excel C
API, while eliminating the need to expose the XLOPER types to Excel-
Dna users.
* These types should be consistently mapped when used as parameters,
return types and as the argument values when calling back into Excel.
* The type mapping should not lose any information or make any calls
to Excel impossible.
* Type conversions should happen inside Excel where possible, and
should emulate how Excel deals with the conversion if implemented in
Excel-Dna.

So I started by mapping types to the different parameter registration
options that Excel supported. I implemented the mapping using the
custom marshaling support in .NET, which was a natural choice at the
time, but probably a mistake in hindsight. To make consistent bi-
directional calls possible, I wanted to be sure that all the different
types that Excel distinguished could be represented by different types
in .NET.

1. As an example of how these things came about, consider your first
point about the double[,] array types:
I register these as double array types with Excel. I chose to do so
because this is the most natural mapping I could see, and I liked the
fact that Excel will do the conversion to doubles. But a consequence of
registering this double[,] type with Excel is the behavior you see
now: Excel never calls into Excel-Dna if the argument is omitted. I
could certainly change it so that the argument is registered as the
XLOPER type (like object) and then do a smart conversion that passes a
null or whatever. The problem with this is that the Excel-Dna code
would now have to implement the policy of how to present a string or
an empty cell or something that is passed as part of the ‘double’
array. I would now have some choices: I could make that entry
double.MinValue or double.NaN or 0. I could return a detailed string
to Excel containing the type error, etc. But all of these
implementation choices seem inappropriate to me as part of the low-
level marshaling - Excel should decide how to deal with functions that
take double[,] which are called with bad parameters or missing values.
In this case Excel decides to return #VALUE without ever calling the
function. I think this is OK, but I also appreciate that other options
could be useful from a user’s perspective. My current design choice is
to prefer that Excel-Dna does not have to hard-wire some specific
design choice like this, but just exposes the Excel way of things.
Also, I’m not sure what the performance impact would be of moving from
registration with Excel as double[,] to registration as object. Until
I’m in a position to measure this, I would rather be cautious and keep
the special double[,] registration as an option. After all, as I
understand you do now, you can register the argument as type object
and call a conversion function at the start of your function to
implement your required semantics. So I chose to keep the consistency
with Excel and the possible performance benefit, at the cost of
requiring an extra conversion line if you need different behavior.
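
The "register as object, then convert at the start of your function"
pattern described here might look something like the helper below. The
helper and its policy (omitted argument becomes an empty matrix, empty
cells become 0.0, non-numeric entries throw) are illustrative, not an
Excel-Dna API:

```csharp
using System;
using ExcelDna.Integration;

public static class ConversionHelpers
{
    // Illustrative conversion function to be called in the first line
    // of a UDF whose argument is registered as object.
    public static double[,] ToDoubleMatrix(object arg)
    {
        if (arg is ExcelMissing || arg is ExcelEmpty)
            return new double[0, 0];        // one possible "omitted" policy

        if (arg is double)
            return new double[1, 1] { { (double)arg } };

        object[,] cells = arg as object[,];
        if (cells == null)
            throw new ArgumentException("Expected a numeric range.");

        int rows = cells.GetLength(0), cols = cells.GetLength(1);
        double[,] result = new double[rows, cols];
        for (int i = 0; i < rows; i++)
            for (int j = 0; j < cols; j++)
            {
                object cell = cells[i, j];
                if (cell is double) result[i, j] = (double)cell;
                else if (cell is ExcelEmpty) result[i, j] = 0.0;
                else throw new ArgumentException("Non-numeric entry.");
            }
        return result;
    }
}
```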

2. Your point 2 is probably best catered for by supporting the
optional and default parameters in VB and C# 4. Passing in false for
missing bools just seems wrong to me. I would rather add support for
nullable types to deal with this….
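
A sketch of what such nullable-type support might let an add-in
express, assuming Excel-Dna mapped an omitted argument to null for a
Nullable<T> parameter (it did not do this at the time of this post, so
the example is hypothetical):

```csharp
using ExcelDna.Integration;

public static class NullableSketch
{
    // Hypothetical: if Excel-Dna passed null for omitted nullable
    // parameters, the add-in could choose its own default instead of
    // receiving a silently coerced false.
    [ExcelFunction]
    public static string Compound(double rate, bool? compound)
    {
        if (compound == null)
            return "argument omitted: add-in chooses the default";
        return compound.Value ? "compound" : "simple";
    }
}
```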

3. In early versions of Excel-Dna I mapped empty values to nulls and
missing arguments to System.Reflection.Missing. However, this caused a
problem, since nulls do not cause the marshaling conversions to be
called in .NET. It thus seemed that I could not build a generic marshaling
interface if null values were involved – because nulls are treated
exceptionally by .NET. So I added the ExcelEmpty value to indicate
empty cells and then, having ExcelError and ExcelEmpty, I decided to
add the ExcelMissing type for consistency. This meant that all the
Excel types map to unique .NET types. With a different marshaling
layer, I now think the nulls could have been dealt with, but at the
time I decided rather to keep the options for future mapping open by
creating the explicit types. I had to make a trade-off between
sticking to fewer concepts and opening up future flexibility in the
type mapping.

4. Wider type support is certainly a good idea. But for string[,] I
would again have to make some choices: do I call Excel to convert
every entry to a string, or implement it myself? With locale
awareness? Etc.

I think it is unlikely that I would implement the changes in your
suggestions 1. and 3. directly. To add support for optional and
default values and some more types like string[,] and Int64 might come
as interim improvements, but I would actually like to make the Excel-
Dna type conversion and marshaling much more flexible and extensible.
My plan over the medium term (hopefully this means some time next
year) is to implement a customizable marshaling interface. I did some
work towards this a year ago, but could not get everything together at
the time. Basically I want to make it possible for you to register the
functions in your conversion wrapper with Excel-Dna, so that you get
whatever conversion and transformation you would like. Similar to the
current Integration.RegisterUnhandledExceptionHandler(…) option there
might be Integration.RegisterTypeConversion(…). In that way, I would
like to make it possible for you to get the exact type conversion
design you suggest, but I want the user to make the decisions about
what to do in the exceptional and problematic cases in a way that suits
their environment, rather than having me decide on some compromise
that is imposed on everyone. This should also come with some pre- and
post method call extension points so that object handles and async
calls can be implemented transparently.
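
The proposed extension point is explicitly tentative; going by the
analogy drawn with RegisterUnhandledExceptionHandler, a registration
call might look something like the sketch below. RegisterTypeConversion
does not exist in Excel-Dna; the name comes from the "might be"
suggestion above and the signature is invented for illustration:

```csharp
using System;
using ExcelDna.Integration;

public static class AddInSetup
{
    public static void Setup()
    {
        // Hypothetical API: the add-in supplies the conversion, so the
        // policy for missing or malformed values lives in user code,
        // not in the library.
        Integration.RegisterTypeConversion(
            typeof(double[,]),
            (Func<object, object>)(incoming =>
            {
                if (incoming is ExcelMissing) return null; // user's choice
                return incoming; // further unboxing would go here
            }));
    }
}
```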

So I have some ideas and have tried to put some code together, but it
is not ready yet and not on the immediate path.
I have not considered work on the marshaling a high priority this
year, since you can implement whatever semantics you require in the
current design; you just need to add the parameter-mapping calls at
the start of your function. In contrast, adding Ribbon, RTD and .NET
4 support, and 64-bit support hopefully in the first part of next
year, have been higher on my list since there are no work-arounds. As
the library matures, I would certainly revisit the marshaling and
hopefully implement a flexible enough story that you could get the
exact interface you suggest.

I hope this gives you some insight into my thinking. Over time I hope
there will be a satisfactory answer to all the aspects you mention.

Kind regards,
Govert