Auto-dependency


Luca Castri

Jul 11, 2022, 4:45:57 PM
to IDTxl
Hi,

from my understanding, in the first step the algorithm looks for past values of the target that have an effect on it at the specified time.

  • Is that a measure of auto-dependency of the target variable?
  • If yes, is there a way to have this measure in the output result?

Thank you,
Luca

p.wol...@gmail.com

Nov 28, 2022, 11:26:17 AM
to IDTxl
Hi Luca,
apologies for the delay. Yes, the first statement is correct, the algorithm first optimizes the target's past state such that it is maximally informative about the target variable at some point in time t.
Unfortunately, right now it is not possible to see this value. What you can do out of the box, however, is to estimate the active information storage (AIS) for each node in your network (there is a demo script on how to do this). This will give you the same information.
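
To make concrete what AIS measures, here is a minimal plug-in sketch in plain Python. It is not IDTxl's estimator or API: the function name `plugin_ais` and the history length `k` are assumptions made for illustration, and the toy series are invented. The sketch computes the mutual information between the last k samples of a discrete series and its current sample.

```python
from collections import Counter
from math import log2


def plugin_ais(series, k=1):
    """Plug-in estimate (in bits) of I(past k samples; current sample)."""
    # Pair each current sample with the tuple of its k predecessors.
    pairs = [(tuple(series[t - k:t]), series[t]) for t in range(k, len(series))]
    n = len(pairs)
    joint = Counter(pairs)
    past = Counter(p for p, _ in pairs)
    current = Counter(c for _, c in pairs)
    return sum(
        (count / n) * log2((count / n) / ((past[p] / n) * (current[c] / n)))
        for (p, c), count in joint.items()
    )


alternating = [t % 2 for t in range(101)]  # past fully predicts the present
constant = [0] * 101                       # nothing to predict, no uncertainty

print(plugin_ais(alternating))  # 1.0
print(plugin_ais(constant))     # 0.0
```

For real, continuous data you would use the AIS analysis shipped with IDTxl (see the demo script mentioned above) rather than this toy estimator.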
Best,
Patricia

Yorgo Hoebeke

Dec 1, 2022, 4:09:01 AM
to IDTxl
Dear Patricia,

I did what you suggested, but I encountered a problem: I noticed the target's selected lags sometimes differ between the mTE and the AIS results.
I used the default settings for the number of permutations. Is it possible that the difference in selected past lags of the variable is due to variability induced by the permutation test for significance? (or is it due to something else?)

Best,
Yorgo

p.wol...@gmail.com

Dec 9, 2022, 2:28:16 AM
to IDTxl
Hi Yorgo,
thanks for making me aware that the last reply did not get posted to the mailing list. So here it is:

Dear Yorgo,
 
yes, the permutation test is one source of non-determinism between two runs. Also, the candidate set used to optimize the target embedding may differ slightly between the two analyses. For TE analysis, the index of the current value used for estimation (target value Y_n in TE estimation) is determined based on the user-defined source and target lags. Consider the following figure:
 
[Figure not available]
 
Which index in the data is used for n depends on the min and max source and target lags. For example, if the source max lag is 50 and the target max lag is 30, the first current value will be the sample in the target process with index 50. If the source max lag is 20 and the target max lag is 23, the current value index will be 23. So the first sample considered as the current value in the first replication is determined by the maximum over the two max lags.
 
In your case, if you run your TE analysis with a larger max_lag for the source than for the target, the estimation will use slightly different samples than what is used in AIS estimation, where you use the target's max lag only.
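
The index rule described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not IDTxl code: the helper names `first_current_value_index` and `n_usable_samples`, the zero-based indexing, and the replication length of 1000 samples are all hypothetical.

```python
def first_current_value_index(max_lag_sources, max_lag_target):
    """Index of the first sample that can serve as the current value."""
    return max(max_lag_sources, max_lag_target)


def n_usable_samples(n_samples, first_index):
    """Number of samples from first_index to the end of one replication."""
    return n_samples - first_index


n_samples = 1000  # hypothetical length of one replication

# TE analysis: both the source and the target max lag matter.
te_index = first_current_value_index(max_lag_sources=50, max_lag_target=30)
# AIS analysis: only the target's max lag is known.
ais_index = 30

print(te_index, n_usable_samples(n_samples, te_index))    # 50 950
print(ais_index, n_usable_samples(n_samples, ais_index))  # 30 970
```

With these example lags, the TE analysis would start 20 samples later than the AIS analysis, so the two estimations see slightly different data.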
 
Best,
Patricia

Patricia Wollstadt

Dec 9, 2022, 2:39:49 AM
to IDTxl
Hi Yorgo,

That is right, the generation of the permutations for the statistical test adds non-determinism. You can set a fixed seed when initialising the Data object, e.g., data = Data(seed=0). This controls the generation of surrogate data and should make it repeatable if you re-initialise the data object for your AIS analysis with the same random seed as for the mTE analysis.
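
Why a fixed seed makes a permutation test repeatable can be illustrated in plain Python, independent of IDTxl. The function `permutation_p_value`, the covariance-based test statistic, and the toy data below are all hypothetical and only stand in for the surrogate generation inside the toolbox.

```python
import random


def permutation_p_value(x, y, n_perm=200, seed=None):
    """p-value of the absolute covariance of x and y under shuffling of y."""
    rng = random.Random(seed)  # seed=None gives a different stream per call

    def statistic(a, b):
        mean_a = sum(a) / len(a)
        mean_b = sum(b) / len(b)
        return abs(sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)))

    observed = statistic(x, y)
    shuffled = list(y)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # one surrogate: destroys the x-y pairing
        if statistic(x, shuffled) >= observed:
            exceed += 1
    return exceed / n_perm


# Toy data with a weak dependence between x and y.
data_rng = random.Random(1)
x = [data_rng.gauss(0, 1) for _ in range(50)]
y = [0.3 * xi + data_rng.gauss(0, 1) for xi in x]

p_a = permutation_p_value(x, y, seed=0)
p_b = permutation_p_value(x, y, seed=0)
print(p_a == p_b)  # True: same seed, same surrogates, same p-value
```

Without a seed, the p-value can change from call to call, so a candidate whose p-value lies close to the significance threshold may be selected in one run and rejected in another.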

Also, if you are using the JIDT Kraskov estimator, a little random noise is added. Unfortunately, we cannot use the random seed for that at this point. However, this should only have a minor effect.

Regarding your other questions:

- In the case the omnibus TE for a target is higher than the sum of the TEs of the selected sources to that target, can we infer synergistic effects are occurring between the sources?

In theory yes, but I would not trust the actual, estimated values due to estimator bias. Estimation bias depends on the dimensionality of the variables, so the omnibus TE is estimated in a higher-dimensional space, which leads to a different bias than for the estimation of the individual TEs. To investigate whether you have a synergistic effect, I would estimate the PID between sources and the target value.

The same holds for the second question. 
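
The "in theory" part can be illustrated with two textbook toy systems, computed exactly from their known distributions so that no estimator bias is involved. This is a sketch under assumptions: plain mutual information stands in for transfer entropy, and the helper `mutual_information` is written for this example, not taken from IDTxl.

```python
from collections import Counter
from itertools import product
from math import log2


def mutual_information(samples):
    """I(A; B) in bits for a list of equally likely (a, b) outcomes."""
    n = len(samples)
    joint = Counter(samples)
    p_a = Counter(a for a, _ in samples)
    p_b = Counter(b for _, b in samples)
    return sum(
        (c / n) * log2((c / n) / ((p_a[a] / n) * (p_b[b] / n)))
        for (a, b), c in joint.items()
    )


inputs = list(product([0, 1], repeat=2))  # four equally likely (x1, x2) pairs

# Synergy: the target is the XOR of two independent source bits.
xor_single_1 = mutual_information([(x1, x1 ^ x2) for x1, x2 in inputs])
xor_single_2 = mutual_information([(x2, x1 ^ x2) for x1, x2 in inputs])
xor_joint = mutual_information([((x1, x2), x1 ^ x2) for x1, x2 in inputs])
print(xor_single_1 + xor_single_2, xor_joint)  # 0.0 1.0

# Redundancy: both sources are copies of one bit, and so is the target.
bits = [0, 1]
copy_single = mutual_information([(b, b) for b in bits])
copy_joint = mutual_information([((b, b), b) for b in bits])
print(2 * copy_single, copy_joint)  # 2.0 1.0
```

In the XOR system the joint information exceeds the sum of the individual terms, and in the copy system it falls below it. With values estimated from data, the bias caveat above still applies, which is why PID is the more reliable tool.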

Hope this helps.
Best,
Patricia

Begin forwarded message:

From: Yorgo Hoebeke <yorgoh...@gmail.com>
Subject: Re: Re: Auto-dependency
Date: December 7, 2022 at 12:55:40 CET
To: Patricia Wollstadt <Patricia....@gmx.de>

Dear Patricia,

Thank you very much for your answer. It is very helpful and clear.

In my analyses, I used the same value for max lag for the target and sources, so is it correct to infer that within this context the only source of variation comes from the permutation test?
If so, could increasing the number of permutations help, at the cost of computing time?

I have an additional question regarding the interpretation of the omnibus TE and the TE from selected sources (unrelated to IDTxl, I admit):
- In the case the omnibus TE for a target is higher than the sum of the TEs of the selected sources to that target, can we infer synergistic effects are occurring between the sources?
- In the case the omnibus TE for a target is lower than the sum of the TEs of the selected sources to that target, can we infer that there is redundancy in the information provided by the sources?
(This is my interpretation after reading "Introduction to Transfer Entropy")

Best,
Yorgo

PS: I noticed your answer did not appear on the IDTxl group; I don't know if this was intentional or not, but I wanted to let you know in case it was not.


Yorgo Hoebeke

Dec 15, 2022, 3:35:37 AM
to IDTxl
Hi Patricia,

I hadn't noticed the seed argument! Thanks for pointing it out; I will follow your advice.
And thank you for bringing PID to my attention. I do not think I have more than 4 sources per target so I think I will be able to use the multivariate PID calculator.
(I already glanced over the two papers linked in the documentation of IDTxl; I'm not sure I'll be able to understand all the math but the information PID gives looks really useful).

Thank you for all your answers,
Best,
Yorgo