Dear Bowen,
Great to hear that you like JIDT :)
You can't turn off bias correction in the KSG method; it's built into the algorithm from the ground up.
In this situation, yes, the linear estimator will be far more data-efficient than the KSG, since the model it assumes is directly relevant to the interactions here. The KSG method, being open to any type of relationship, simply requires more data to reliably detect such interactions. You should see it converge towards the correct answer as you provide more and more data.
If you want to see what it would look like without bias correction, you could try the kernel estimator; however, recall that this estimator is very sensitive to the kernel width, which is difficult to set well.
But that does suggest another option: the KSG estimator works by setting the kernel width judiciously (to the k nearest neighbours at each point), then plugging the neighbour counts through the digamma function. What you could do is:
- Turn on debug mode for the estimator: calc.setDebug(true);
- When you make a calculation, this will then print n_xz (source and conditional count), n_yz (target and conditional count) and n_z (conditional count). If you can parse these, you could compute log( k * n_z / (n_xz * n_yz) ) for each sample, then average that across all samples. This would give you the TE from a kernel estimate with the kernel width set to the k nearest neighbours at each point, essentially without the additional bias correction.
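To sketch what that averaging step might look like once you've parsed the counts out of the debug output (this is just an illustration with made-up counts, not part of JIDT itself; the class and method names here are my own):

```java
public class PlugInTeFromCounts {

    /**
     * Average of log( k * n_z / (n_xz * n_yz) ) over all samples,
     * i.e. the plug-in (non-bias-corrected) TE estimate described above.
     * The count arrays are assumed to have been parsed from JIDT's
     * debug output, one entry per sample. Result is in nats.
     */
    public static double teWithoutBiasCorrection(
            int k, int[] nXz, int[] nYz, int[] nZ) {
        double sum = 0.0;
        for (int i = 0; i < nZ.length; i++) {
            // Per-sample local term: log( k * n_z / (n_xz * n_yz) )
            sum += Math.log((double) k * nZ[i] / ((double) nXz[i] * nYz[i]));
        }
        return sum / nZ.length; // average over samples
    }

    public static void main(String[] args) {
        // Toy counts for three samples with k = 4 (illustrative numbers only)
        int[] nXz = {10, 8, 12};
        int[] nYz = {9, 11, 10};
        int[] nZ  = {20, 22, 25};
        System.out.println(teWithoutBiasCorrection(4, nXz, nYz, nZ));
    }
}
```

Note the result is in nats; divide by Math.log(2) if you want bits, to match what you'd compare against the standard KSG output.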
That may increase the values, though unlikely by much, and turning off the bias correction is not really recommended anyway.