Minimizing edge effects / analyzing signal around the edge of a segment

694 views
Skip to first unread message

Jan W.

unread,
Oct 28, 2014, 1:28:35 PM10/28/14
to analyzingneura...@googlegroups.com

Hi Mike and list,

I’m running into a bit of a unique problem, and maybe you have a proposal to address it.

We’re collecting EEG-TMS data, and we can’t fully clean up the TMS artifact (at least not to an extend that makes me feel comfortable nothing will leak into the pre-artifact period when doing filter-hilbert with a two-way FIR).

However, we’re basically most interested in the data immediately before the onset of the TMS artifact. Filter-hilbert is ill-advised here.

So I’ve been looking into multi-tapered FFT instead. However, I have not enough experience with it to be confident enough to just apply and go forward with it in this particular circumstance. 

Basically, what we’re trying to do is to quantify freq power as close to the TMS pulse as possible. I was thinking about cutting out data period leading up to the onset of the artifact, and then pad them with some artificial data (or just the flipped epoch) at the end [where the tms artifact is], and then using a multi-tapered FFT.

One thing I’m wondering is, if that is indeed the correct approach, how far backwards from the artifact will I have to throw out data because of edge effects?


To formulate the question more broadly (which might be better): How would you extract power info from a segment that is as short as possible, has decent SNR, especially at the very end of the segment (trying to do a single-trial analysis), and has no edge effects/leakage from the period after the end of the segment? Frequency resolution is not very important, and neither is temporal information, really. One value per frequency per segment would be sufficient as far as time goes.


I hope you can discern what I mean from my scattered blips. Thanks for any input!

Cheers,

Jan

Mike X Cohen

unread,
Oct 28, 2014, 2:13:46 PM10/28/14
to analyzingneura...@googlegroups.com
Hi Jan. Sounds like a fun problem ;) 

First of all, I would keep trying to get rid of the artifact. I had tested removing TMS artifacts from EEG data several years ago. If I remember correctly, the magnetic pulse artifact is a massive sharp downward edge (~2 ms) followed by a logarithmic increase back to normal values (~20 ms). I was able to remove it satisfactorily (for piloting anyway; I never ran a full study) using a two-step approach: First, I fit a logarithmic function to the rise and then kept the residual as the valid EEG data; second, I deleted the remaining artifact and Fourier-interpolated across. That means taking the FFT of the ~100 ms before the TMS pulse and the FFT of ~100 ms after the TMS pulse artifact was over, creating two new time series using the power spectra with randomized phases, and then linearly weighting the two time series over time (thus, for example, the reconstructed time series at 25 ms contained 75% of the spectrum before the pulse and 25% of the spectrum after the pulse). Because the phases were shuffled, the time-domain results would be a bit weird and perhaps invalid, but the frequency domain results would be interpretable.

I don't follow the EEG-TMS literature too closely, but perhaps it's an idea to see how other people have gotten rid of this artifact? Or perhaps someone else on the list has some suggestions or success stories? 


Anyway, on to your question. You are correct that the filter-Hilbert method is not appropriate here. Probably the best thing to do would be to put the end of the time window for short-time FFT (STFFT) or multitaper just before the TMS pulse artifact starts. As long as you apply a taper to the data (e.g., Hann), there will be no edge artifact, and there will be no leakage from before the start of the window or after the TMS pulse. As for where to start the time window, that is a trade-off between increasing SNR (longer window) and increasing temporal precision (shorter window). Certainly you'd want a minimum of 3 cycles at your lowest frequency. But because you are interested in the beta band, you can get many cycles in only a few tens of ms. 

As for using STFFT or multitaper, because you mention doing single-trial analyses and because you don't really care about frequency precision, perhaps multitaper might be a better option. 

What I recommend is writing a short script that will loop over time window widths (that is, all time windows end at the TMS pulse artifact, but start at variable times), and then get beta power from STFFT and multitaper and just see what works better. Of course, this exploratory procedure should either be evaluated in a hypothesis-orthogonal way, or you should correct the results for multiple comparisons over time window widths. If you do something like this, it would be interesting to learn about the results! 

I hope that helps, and let me know if you have more questions. 

Mike



--
You received this message because you are subscribed to the Google Groups "AnalyzingNeuralTimeSeriesData" group.
To unsubscribe from this group and stop receiving emails from it, send an email to analyzingneuraltimes...@googlegroups.com.
Visit this group at http://groups.google.com/group/analyzingneuraltimeseriesdata.
For more options, visit https://groups.google.com/d/optout.



--
Mike X Cohen, PhD
mikexcohen.com

Jan W.

unread,
Oct 29, 2014, 5:55:17 PM10/29/14
to analyzingneura...@googlegroups.com
Hi Mike,

first of all, thanks very much for taking the time! It took me a bit to get to this, so sorry for the delay in responding. I now tried some of the things you alluded to, and want to share the results with you and everyone else.

To start off, without giving away too much of what we're doing (this is a collaborative project and I don't know how comfortable the co-authors are with certain information that pertains to the design of the study / nature of the data), let's assume I can't get rid of the TMS artifact using any number of methods that I've tried (ICA, adaptive curve fitting, etc.) owing to the idiosyncrasies of this dataset.

Now, to the problem I talked about, and specifically, to the question of whether to use a hanning tapered (ht) or a multi tapered (mt) FFT.

I wrote a short script that is supposed to compare the raw power estimates of the two methods (I hope that's what it does). I'm sharing this script here.
In order to make the twp comparable, I had to rms-normalize the power within each taper I retained (everything with lambda > .99) to the power within the hanning window. I'm assuming that's a kosher thing to do.

I basically took one segment of our data (one channel from one subject), and applied ht and mt FFT analysis on three different time-windows (1,2,3 cycles).
I tried this for different frequencies ranges (one has to adapt the frequency smoothing coefficient depending on frequency range, lower number for higher freqs).

The results seem to be as follows (all assuming I made no mistakes, which is not a good assumption ;) ):

For frequencies ranges lower than gamma, hanning tapers seem to give higher power estimates (which doesn't make them more accurate, but I'm assuming this kinda scales with snr). For everything > gamma, multitapers take the bacon, at least in this one channel of this one dataset.

For people that don't want to run the code themselves, here are two samples, one for beta, one for gamma:

BETA
Testing Hanning taper FFT vs. Multitapers
Frequency range: 13 to 30 Hz, Frequency smoothing coefficient: 0.4
Found 134 TMS pulses
Starting hanning-tapered FFT...
Mean Power (Hanning taper):
 1 cycle: 41 (mean); 67 (peak)
 2 cycle: 77 (mean); 195 (peak)
 3 cycle: 90 (mean); 305 (peak)
Starting multi-tapered FFT...
 Generated 10 tapers, kept 8
Mean Power (Multi taper):
 1 cycle: 22 (mean); 22 (peak)
 2 cycle: 46 (mean); 66 (peak)
 3 cycle: 62 (mean); 64 (peak)

GAMMA
Testing Hanning taper FFT vs. Multitapers
Frequency range: 30 to 60 Hz, Frequency smoothing coefficient: 0.2
Found 134 TMS pulses
Starting hanning-tapered FFT...
Mean Power (Hanning taper):
 1 cycle: 7 (mean); 7 (peak)
 2 cycle: 4 (mean); 8 (peak)
 3 cycle: 0 (mean); 1 (peak)
Starting multi-tapered FFT...
 Generated 12 tapers, kept 10
Mean Power (Multi taper):
 1 cycle: 4 (mean); 4 (peak)
 2 cycle: 15 (mean); 15 (peak)
 3 cycle: 19 (mean); 30 (peak)


So, I guess, this could give some insights into this problem. The caveats are that this is just one channel in just one subject, and, again, there is no reason to assume that I didn't make a coding mistake. Mike, maybe you can double-check if you find the time.

Maybe I'll do a more systematic investigation a bit later today. I'll report back if I do.

Code and sample data attached.

Cheers,
Jan
To unsubscribe from this group and stop receiving emails from it, send an email to analyzingneuraltimeseriesdata+unsub...@googlegroups.com.
sample_data.mat
hFFTvsMT.m

Jan W.

unread,
Oct 29, 2014, 8:34:52 PM10/29/14
to analyzingneura...@googlegroups.com
Further to the last: I have now analyzed all the data I have for this study in that manner.
I have 49 datastreams overall.

Again, barring a coding mistake or a mistake in the underlying rationale of this comparison is set up, it actually seems to me like the number of cycles is the major determinant of which method 'performs better' (in the sense of 'ends up with higher power estimates').
For one cycle, the hanning taper appears superior, regardless of frequency-band or quantification (highest peak within band or mean across all frequencies within band).

This obviously has to be taken with caution, as this analysis is not very sophisticated (and there could still be something fundamentally wrong with the initial function [see above]).

BETA
1 Cycle HT vs MT (13-30Hz), Mean power:
  t(48) = 1.4878, p = 0.14334
2 Cycle HT vs MT (13-30Hz), Mean power:
  t(48) = -21.8538, p = 1.3591e-26
3 Cycle HT vs MT (13-30Hz), Mean power:
  t(48) = -20.818, p = 1.124e-25
1 Cycle HT vs MT (13-30Hz), Peak power freq:
  t(48) = 22.8785, p = 1.8199e-27
2 Cycle HT vs MT (13-30Hz), Peak power freq:
  t(48) = -6.1164, p = 1.6646e-07
3 Cycle HT vs MT (13-30Hz), Peak power freq:
  t(48) = -6.9917, p = 7.5724e-09

GAMMA
1 Cycle HT vs MT (30-60Hz), Mean power:
  t(48) = 3.7907, p = 0.00041971
2 Cycle HT vs MT (30-60Hz), Mean power:
  t(48) = -23.2025, p = 9.7899e-28
3 Cycle HT vs MT (30-60Hz), Mean power:
  t(48) = -23.1746, p = 1.0323e-27
1 Cycle HT vs MT (30-60Hz), Peak power freq:
  t(48) = 29.3647, p = 2.511e-32
2 Cycle HT vs MT (30-60Hz), Peak power freq:
  t(48) = -9.9583, p = 2.9037e-13
3 Cycle HT vs MT (30-60Hz), Peak power freq:
  t(48) = -10.8552, p = 1.6118e-14

So long.
Jan

Mike X Cohen

unread,
Oct 30, 2014, 4:25:09 AM10/30/14
to analyzingneura...@googlegroups.com
Hi Jan. Very interesting, thanks for sending the code and sample data. I suppose it is sensible that longer windows will produce higher absolute power values. 

On the other hand, it is the change in power values over trials/conditions that is the main dependent variable. It might additionally be insightful to compute the SNR (which can be estimated as mean/std over trials) to have a measure of robustness. I tried this quickly, and the results seemed comparable between Hann and MT, and reasonably similar across window sizes. That's comforting, because these are fairly minor modifications; it would be troubling if there were huge differences between 1 and 2 cycles. 

Below are some comments on your code. 

1) For the initializations on lines 27-30, you can also use the 'deal' function: 
[o1,o2,o3] = deal( zeros(3,5) );
is the same thing as:
o1 = zeros(3,5);
o2 = o1; 
o3 = o1;

2) Your definition of the Hann taper is one point off. It should go 1:n instead of 0:n-1 (Matlab starts counting at 1). Not a big deal, and it won't change any of the results in any meaningful way. 

3) I always forget how eeglab stores latencies, but are you sure that those are latencies in timesteps and not ms? 

4) When defining the frequencies in Hz, you specify ceil(N/2) as the frequency resolution. In general, the frequency resolution is N/2+1 (+1 for the DC). In 2/3rds of your window sizes, this works correctly because N is an odd number. However, if N were even, the frequencies would be very slightly off. It might be good to use the code floor(N/2)+1, which will work whether N is odd or even. This also wouldn't make or break any results (particularly as N increases), but it's good to be a bit more accurate. 

5) For the multitaper, you are correct that the tapers should be normalized to get the results back to microvolts. However, normalizing by the RMS of the Hann taper is not really optimal, because you end up with values greater than 1, in other words, the taper is actually accentuating part of the data. I recommend instead normalizing each taper to its own maximum: 
tsf(:,itp) = tsf(:,itp)./max(tsf(:,itp));


Mike


To unsubscribe from this group and stop receiving emails from it, send an email to analyzingneuraltimes...@googlegroups.com.
Reply all
Reply to author
Forward
0 new messages