Dear Bowen,
Great to hear that you like JIDT :)
You can't turn off bias correction in the KSG method; it's built into the algorithm from the ground up.
In this situation, yes, the linear estimator will be far more data-efficient than the KSG, since the model it assumes is directly relevant to the interactions here. The KSG method, being open to any type of relationship, simply requires more data to reliably detect such interactions. You should see it converge towards the correct answer as you provide more and more data.
If you want to see what it would look like without bias correction, you could try the kernel estimator; however, recall that this estimator is very sensitive to the kernel width, which is difficult to set well.
But that does suggest another option: the KSG estimator works by setting the kernel width judiciously (to the k nearest neighbours at each point), then plugging the neighbour counts through the digamma function. What you could do is:
- Turn on debug mode for the estimator: calc.setDebug(true);
- When you make a calculation, this will then print n_xz (source and conditional count), n_yz (target and conditional count) and n_z (conditional count). If you can parse these, you could compute log( k * n_z / (n_xz * n_yz) ) for each sample, then average that across all samples. This would give you the TE from a kernel estimate with the kernel width set to the k nearest neighbours at each point, essentially without the additional bias correction.
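To sketch what that averaging step might look like once you've parsed the counts out of the debug output (this is just an illustration with made-up counts, not part of JIDT itself; the class and method names here are my own):

```java
public class PlugInTeFromCounts {

    /**
     * Average of log( k * n_z / (n_xz * n_yz) ) over all samples,
     * i.e. the plug-in (non-bias-corrected) TE estimate described above.
     * The count arrays are assumed to have been parsed from JIDT's
     * debug output, one entry per sample. Result is in nats.
     */
    public static double teWithoutBiasCorrection(
            int k, int[] nXz, int[] nYz, int[] nZ) {
        double sum = 0.0;
        for (int i = 0; i < nZ.length; i++) {
            // Per-sample local term: log( k * n_z / (n_xz * n_yz) )
            sum += Math.log((double) k * nZ[i] / ((double) nXz[i] * nYz[i]));
        }
        return sum / nZ.length; // average over samples
    }

    public static void main(String[] args) {
        // Toy counts for three samples with k = 4 (illustrative numbers only)
        int[] nXz = {10, 8, 12};
        int[] nYz = {9, 11, 10};
        int[] nZ  = {20, 22, 25};
        System.out.println(teWithoutBiasCorrection(4, nXz, nYz, nZ));
    }
}
```

Note the result is in nats; divide by Math.log(2) if you want bits, to match what you'd compare against the standard KSG output.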
That may increase the values, though unlikely by much, and turning off the bias correction is not really recommended anyway.