RTD volatility

88 views
Skip to first unread message

Toby Jacobs

unread,
Jul 23, 2021, 4:58:10 AM7/23/21
to Excel-DNA
I am making use of the RTD functionality awesomely provided by ExcelDNA in the following manner:

[ExcelFunction(Description = "Observable GetMetric stub", IsHidden = true, IsVolatile = false)]
public static object MyFunction(object p1, object p2, object p3, object p4)
{
    string sender = xlapp.Caller.Address(External: true);
    List<object> parameters = PrepareParameterList(p1, p2, p3, p4);
    object[] asyncParams = new object[] { sender, parameters.ToArray() };
    IProductSolutionsComServer currentInterface = CurrentAddinComInterface;     // Interface into the VSTO addin where the work will take place
    string key = currentInterface?.BeginMyFunction(sender, parameters.ToArray());
    IObservableShim observableShim = currentInterface?.WatchKey(key);           // Returns an observable-ish provided by the rest of the addin that will return the value
    IExcelObservable observableWrapper = new ExcelObservable(observableShim);   // More or less the example that's provided in the RTD samples
    return ExcelAsyncUtil.Observe(nameof(MyFunction), asyncParams, () => observableWrapper);
}

private static List<object> PrepareParameterList(object p1, object p2, object p3, object p4)
{
    var parameters = new List<object>();
    if (!(p1 is ExcelMissing)) parameters.Add(p1);
    if (!(p2 is ExcelMissing)) parameters.Add(p2);
    if (!(p3 is ExcelMissing)) parameters.Add(p3);
    if (!(p4 is ExcelMissing)) parameters.Add(p4);
    return parameters;
}

This works well in that in the workbook, I see an #N/A returned until the observer returns a value. The observable then (behind the scenes) has OnCompleted called, which I believe should mean Excel gets told "this no longer needs to be watched", and hopefully the cell is calculated and stays that way.

I've noticed, though, that when an input parameter to ONE of my functions changes, all cells containing MyFunction also recompute even though they might not depend on that cell. Ideally I'd like to retain the standard levels of volatility in Excel - i.e. unless an argument changed, or one of its dependencies changed, the cell is not marked as requiring recalculation. Any idea what I'm doing that would mean this is not what I would get?

Thanks in advance, and doubly so to Govert for writing this awesome toolkit!

Govert van Drimmelen

unread,
Jul 23, 2021, 12:48:08 PM7/23/21
to Excel-DNA
--------------------------------------------------
Excel-DNA is now registered on GitHub Sponsors.
You can add Excel-DNA support with easy billing through a corporate GitHub account.
--------------------------------------------------

Hi Toby,

The RTD cells should recalculate when the inputs change or when the observable signals a new value, so your expectation of the correct behaviour sounds right.
I'm not sure exactly how you're setting up the observable, but there are some aspects of your wrapper function that I should flag:

I would use this structure instead - with some added comments to explain:

public static object MyFunction(object p1, object p2, object p3, object p4)
{
    // Use the C API to get the caller - this returns an ExcelReference object that can be added to the unique key
    // I'm prefer to avoid the COM object model in UDF methods
    var caller = XlCall.Excel(XlCall.xlfCaller) as ExcelReference;


    // Use the paramters as they are to make the unique key
    // The ExcelMissing items are OK in here
    var ayncParams = new object[] { caller, p1, p2, p3, p4 };

    // This call will check the params to see whether we must make a new observable or return the current value directly
    // Every update of the observable will cause the UDF to be called again
    // but we only want to create a new observable if the arguments have changed
    return ExcelAsyncUtil.Observe(nameof(MyFunction), asyncParams, () =>
    {
        // All your interesting code to set up the observable should go in here
        // This delegate only gets invoked for the first time a new set of arguments is seen

        // If you need a string for your own BeginMyFunction call you need to take some care here
        // You can get a sheet reference with xlSheetNm, but need to build the rest of the address yourself
        string sender = ....
        List<object> parameters = ...
        object[] asyncParams = ...

        IProductSolutionsComServer currentInterface = CurrentAddinComInterface;     // Interface into the VSTO addin where the work will take place
        string key = currentInterface?.BeginMyFunction(sender, parameters.ToArray());
        IObservableShim observableShim = currentInterface?.WatchKey(key);           // Returns an observable-ish provided by the rest of the addin that will return the value
        IExcelObservable observableWrapper = new ExcelObservable(observableShim);   // More or less the example that's provided in the RTD samples

        return observableWrapper;
    });

}

If you need to dig into this a bit more deeply, you are also welcome to contact me directly to arrange for a support call.

-Govert

Toby Jacobs

unread,
Jul 27, 2021, 3:56:34 AM7/27/21
to Excel-DNA
Hi Govert,

Thanks for the above. I've made the changes you suggested, which unfortunately didn't make a difference as far as volatility was concerned. However, I think I've worked out what's going on, and it might be worth considering a settings tweak on the side of DNA in response - but you know... might be.

Long story short, I can get what looks like the behaviour I want by suppressing the OnComplete() notification on the observable I return. However, I'm not 100% convinced this is actually doing what I want it to - for instance is Excel, internally, continually checking all those RTD topics? This is important to my use-case, since ideally I want ~300k of these to be present in the same workbook, and I don't want Excel to be checking cells that definitely don't have a new value. I also want to be conscious of memory usage etc.

The actual execution of MyFunction is supposed to have a fairly similar lifecycle to a CUBEVALUE formula - which is to say:
  1. Initial request for a value comes in
  2. The addin goes away and fetches the value, but this will take time, and there's no reason for the Excel UI to block while waiting - effectively this is the #Getting_Data step
  3. It sets up a ReplaySubject with a size of 1 to represent the value, and returns it
    1. This subject gets kept in a dictionary that is unique to the cell, so there will be a lot of them, and I don't want them to live forever, since that will result in a very big dictionary (I'm expecting to be computing in the region of 300k formulae at times, which is a sizable dictionary)
  4. When the value lands:
    1. It gets pushed to the appropriate subject internally
    2. And an OnComplete message gets pushed too
  5. The expectation is that, on the DNA side, the OnComplete means "this subject isn't going to say any more, so just keep the last value and don't bother including an RTD execution, and if you can, drop the reference to the IObservable"
I believe what is happening instead is that DNA is:
  1. Retaining the last value
  2. Dropping the observable
  3. Releasing the topic - meaning it is effectively saying Ok, this time the value was X, but maybe next time there will be a different result, so we might need to check that topic again.
The thing is, the topic itself is retained - but now it's back in an unknown state. So whilst Excel won't poll it for future numbers, it will happily start all over again from the top when the next outer call comes in. When another RTD call then lands for some other cell, what then appears to happen is:
  • The list of topics that are present in the workbook is checked. This covers all instances of MyFunction
  • A lot of them are identified as potentially needing work, since their observation has completed
  • ... Therefore the setup code is executed again, and a new request is dispatched to my addin, which results in a fresh ReplaySubject representing a new transient value
  • ... And they all recalculate, even if no arguments have changed.
Long story short, without the ReplaySubject.OnComplete() call, everything behaves as you'd expect it to - I just don't have visibility on the memory implications of having a lot of additional subjects kicking around on the DNA side, or on any CPU implications around when Excel is actually checking to see if new data has landed. For a few hundred cells, so far, it's fine, but I still need to check at around the limits of the use-case I'm expecting.

I'll see if I can set up an example project somewhere that illustrates this in case you want to play around, though that probably won't come until later.

Thanks,
Toby

Govert van Drimmelen

unread,
Jul 27, 2021, 2:34:14 PM7/27/21
to Excel-DNA
Hi Toby,

You are right that after the RTD topic is disconnected (following the OnComplete call) a recalculation of the cell would cause a new topic to be created again (since there is no current topic with the same parameter set) and the inner delegate will create a fresh observable and possibly back-end call. So it might be that you want to either keep the RTD topics alive, or to keep your own cache so that you can short-circuit the RTD calls for data you already have in the cache and just return these immediately. Regarding the memory usage of RTD topics internally, you should definitely profile it - you might find that 300k topics are not a problem at all. Excel does not poll the RTD topics or "check to see if new data has landed" - they are only read and the relevant calling cells recalculated if a changed value is pushed. 

A cell where the RTD topic has completed should only recalculate if there is a change of the input argument (or a recalculation of the sheet). So while you are right about the topic being released after the OnComplete, this does not imply that the cell would be recalculated again if the inputs don't change. In this sense it sounds like you are seeing a volatility of the cell that I don't expect, and RTD cells should not be more 'volatile' than regular cells.

It sounds a bit like you're duplicating some of the internal RTD mechanism on your side too. For example, I'm not sure why you need information about the calling cell for your RTD call - the link between a cell and the RTD topic is already managed by Excel. You just have to disambiguate RTD topics according to the parameters use to fetch the back-end data.

-Govert

Toby Jacobs

unread,
Jul 28, 2021, 3:39:55 AM7/28/21
to Excel-DNA
Hi Govert,

Yeah, I think for now just leaving them alive is probably the right call, but will profile in due course. I may retain them in cache too, but at the moment, the approach is to drop them once they're done. I just previously figured dropping them completed made sense, but seems that was incorrect. Really good to know that the model is Observable get value --> Excel gets observable added to collection that it should check when it gets to it --> Excel checks only those observables in the collection of "some change has happened" --> number updates (or not) though. The disadvantage of having ready made tools is that kind of thing ends up being obfuscated!

The cell identity is actually kinda useful to have in the backend, if there is some orientation required. I mainly use it because the VSTO side is an application-level addin, not a workbook one, so consequently, there's a need to find out which workbook component is responsible for doing the work; since Excel computes all workbooks when recomputing, I can't just assume it's the active workbook where the cell comes from. Since the workbook and cell ref are more or less the same cost to obtain (or certainly were when using the Caller approach), I get the literal location "for free".

There are though a couple of scenarios where it could be useful if my function goes in that direction - for instance, the cell location might provide context that would mean the same arguments would evaluate differently. I'm thinking if I have a dependency on e.g. the list object against which a cell is part. I don't yet, but it's the kind of place I can imagine going.

That said, I'm definitely seeing recalculation of cells where the inputs don't change. As I say, I'll see about a toy example for you if I can replicate, since it'd be good to give you the chance to look into it if it isn't the behavior you're expecting.

Thanks,
Toby
Reply all
Reply to author
Forward
0 new messages