Noise overwhelming the Transfer Entropy estimates - Kraskov Algorithm 1

Aleksander Janczewski

Feb 4, 2022, 1:52:31 PM

to Java Information Dynamics Toolkit (JIDT) discussion

Dear Professor Lizier,

Thank you very much for all of your suggestions in the previous threads. I very much appreciate all your help!

I have one more question that I could not find answered on either the JIDT or IDTxL discussion boards, or in any publications. I am working with quite unusual data, which also makes it hard to find publications on related problems.

As I mentioned in my previous threads, I am trying to estimate Transfer Entropy (TE) and Conditional Transfer Entropy using the Kraskov estimator (Algorithm 1). I am noticing that the noise I add to the data completely overwhelms the estimates that the estimator yields. Not a single significant figure stays the same after rerunning the code, and the estimates are all over the place. The magnitude of the estimated TEs also changes substantially as I change the magnitude of the noise. Finally, JIDT just breaks if I do not use any noise. I have already experimented with various noise magnitudes, down to 1E-15, which seems to help somewhat in reducing the variability of the estimates. Increasing the number of nearest neighbors “k” also helps, of course, but then I am making a bias-variance trade-off.
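For concreteness, the noise-adding step I am describing can be sketched roughly as follows. This is a minimal Python illustration and not JIDT's internal code: the function name `add_noise`, the choice of Gaussian noise applied after standardization, the fixed seed, and the toy series are all assumptions made for the example.

```python
import random
import statistics


def add_noise(series, noise_level=1e-8, seed=None):
    """Standardize a series, then add Gaussian noise with std `noise_level`.

    Hypothetical helper for illustration only; not part of JIDT.
    """
    rng = random.Random(seed)  # private generator, so the seed is local
    mean = statistics.fmean(series)
    std = statistics.pstdev(series)
    if std == 0:
        raise ValueError("series is constant; cannot standardize")
    return [(x - mean) / std + rng.gauss(0.0, noise_level) for x in series]


# Toy differenced series with repeated values (made-up numbers).
toy = [0.0, 0.5, 0.0, -0.5, 0.5, 0.0, 0.0, -0.5]

run_a = add_noise(toy, seed=1)
run_b = add_noise(toy, seed=1)  # same seed: identical perturbation
run_c = add_noise(toy, seed=2)  # different seed: different perturbation

print(run_a == run_b)  # the two same-seed runs match exactly
print(run_a == run_c)  # runs with different seeds differ
print(len(set(toy)), len(set(run_a)))  # distinct values before / after noise
```

Fixing the seed in a sketch like this only makes a given run repeatable. It does not remove the dependence of the estimate on which noise realization is used, which is the variability I am asking about.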

With that said, my questions are:

1. Do you know of any alternatives to adding the noise to the data? It really seems that the magnitude of the noise directly impacts the magnitude of TE estimates.

2. Would you say that minimizing the noise as far as possible (down to machine precision) and significantly increasing the “k” parameter would be the right approach for this kind of problem? Is there a maximum “k” value you would consider acceptable?

3. How would you approach this kind of problem? Is there any other way to set the JIDT parameters to minimize this variability?
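Regarding question 2, one caveat with pushing the noise towards machine precision can be shown with a short Python sketch. It assumes IEEE 754 double precision (which both Java and Python use for their double/float types) and made-up values: noise much smaller than the floating-point spacing around a data value is rounded away, so it can no longer separate identical values.

```python
import math
import sys

# Spacing between 1.0 and the next representable double (about 2.2e-16).
eps = sys.float_info.epsilon
print(eps)

x = 1.0
print(x + 1e-15 == x)  # False: 1e-15 is larger than the spacing near 1.0
print(x + 1e-17 == x)  # True: 1e-17 is rounded away entirely

# For larger values the spacing grows, so even 1e-15 disappears.
big = 1000.0
print(math.ulp(big))       # spacing of doubles near 1000.0 (Python 3.9+)
print(big + 1e-15 == big)  # True: the noise has no effect at this magnitude
```

So whether noise of size 1E-15 does anything at all depends on the magnitude of the values it is added to, which matters here because my data is not standardized.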

I am also looking for a solution that avoids multiple reruns of the estimation procedure since, with the current variability, I think I would need a lot of reruns to get a somewhat reliable estimate.

I am attaching an example of the data that I am working with. It has 6000 points for 7 variables. The data is already differenced (as the original data is not stationary); it is not standardized and no noise has been added to it. It can be used as-is in any JIDT code or in the AutoAnalyser with default parameters, just so you can see what I am referring to in this thread.

I would love to hear your opinion on this matter.

With Best Regards,

Aleksander

data_example.txt
