Data Interpolation


Douglas Hull

Mar 20, 2021, 9:15:45 PM
to IDTxl
Good Afternoon!

Does the IDTxl package have any functionality to handle missing data or to perform data interpolation? I am running analyses on several data sets where, periodically, a single value or a stretch of values for one or more of the nodes has been recorded as 0 rather than the actual value. The analysis I am running is MultivariateTE using the KraskovCMI estimator. Any guidance on how best to remedy this problem, using either built-in IDTxl functions or external methods, would be greatly appreciated. Thank you all very much for your help!

Joseph Lizier

Mar 21, 2021, 6:54:32 PM
to Douglas Hull, IDTxl
Hi Douglas,

We haven't yet built methods into IDTxl that can handle missing data.
It is doable: for example, flagging certain samples as invalid is already supported in the transfer entropy / CMI methods of the underlying JIDT engine. It needs some thinking through, though, as it's a little trickier here than when running a single TE calculation.

In the interim, I would suggest that you pre-parse the data to remove these time steps. First, flag for yourself whether each time step contains only valid data. Then go back through those flags, pull out the contiguous sub-time-series in which every sample is valid, and add each of these fully valid sub-time-series as a separate replication in your Data object. Does that make sense? This is a conservative approach that will throw out some data (e.g. where only one node had an invalid measurement but the others may have been usable), but provided you don't have too much invalid data, you should still be left with enough to analyse.
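The pre-parsing step described above could be sketched roughly as follows with NumPy. This is only an illustration, not part of IDTxl: the function name `valid_segments` and its parameters are hypothetical, the data is assumed to be a 2D array of shape (processes, samples), and a recorded 0 is assumed to always mean "missing" (which is wrong if genuine zeros can occur in your measurements).

```python
import numpy as np


def valid_segments(data, invalid_value=0.0, min_length=1):
    """Split data (processes x samples) into contiguous runs of fully valid time steps.

    Hypothetical helper: `invalid_value` marks a missing measurement.
    """
    data = np.asarray(data)
    # A time step is valid only if no process recorded the invalid marker.
    valid = np.all(data != invalid_value, axis=0)
    # Pad with False so every run of valid steps has a detectable start and end.
    padded = np.concatenate(([False], valid, [False]))
    changes = np.flatnonzero(padded[1:] != padded[:-1])
    starts, stops = changes[0::2], changes[1::2]  # stop indices are exclusive
    # Keep only runs that are long enough to be worth analysing.
    return [data[:, a:b] for a, b in zip(starts, stops) if b - a >= min_length]


# Two processes, six samples; zeros mark missing values.
example = np.array([[1.0, 2.0, 0.0, 4.0, 5.0, 6.0],
                    [1.0, 2.0, 3.0, 4.0, 0.0, 6.0]])
segments = valid_segments(example)
```

The resulting segments will generally have different lengths, so `min_length` is there to discard runs too short to cover the embedding and lag parameters of the analysis.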

@Patricia/Leo/Michael - could you also confirm here that we don't require all replications to have the same length? I don't think we do; from memory, some of the stats methods can make use of equal lengths when they are available, but it isn't required.

--joe



p.wol...@gmail.com

Mar 30, 2021, 11:02:20 AM
to IDTxl
Hi Joe, hi Douglas,
Joe, thanks for answering this. The approach you describe is currently the only way to handle missing data and still use IDTxl. There is no functionality in the toolbox that handles missing/invalid data for you.

Unfortunately, we do require all replications to have the same length. In theory this isn't necessary; it is just how IDTxl is set up. We may add support for replications of different lengths at some point, but I am afraid there are a few more things on the list before that.
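Given the equal-length requirement, one possible workaround is to cut each fully valid segment into non-overlapping windows of a fixed length and stack those windows as replications. The sketch below is an assumption-laden illustration rather than an IDTxl feature: `stack_equal_length` and `window` are hypothetical names, each segment is assumed to be a 2D array of shape (processes, samples), and leftover samples that do not fill a whole window are discarded.

```python
import numpy as np


def stack_equal_length(segments, window):
    """Cut segments (each processes x samples) into equal windows and stack them.

    Hypothetical helper; returns an array of shape
    (processes, window, replications).
    """
    chunks = []
    for seg in segments:
        seg = np.asarray(seg)
        n_windows = seg.shape[1] // window  # leftover samples are dropped
        for k in range(n_windows):
            chunks.append(seg[:, k * window:(k + 1) * window])
    if not chunks:
        raise ValueError("no segment is at least `window` samples long")
    return np.stack(chunks, axis=2)


# Three valid segments of lengths 5, 3 and 7 for two processes.
segs = [np.arange(10).reshape(2, 5),
        np.arange(6).reshape(2, 3),
        np.arange(14).reshape(2, 7)]
stacked = stack_equal_length(segs, window=3)

# Assumed IDTxl usage (not executed here, please check against the docs):
# from idtxl.data import Data
# data = Data(stacked, dim_order='psr')  # processes, samples, replications
```

Choosing `window` is a trade-off: longer windows leave more samples per replication for the embedding, while shorter windows waste less data at the ends of segments.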

Best, Patricia  
