Announcement: TUH EEG Seizure v1.5.1

Joseph Picone

Apr 5, 2020, 9:53:10 PM
to nedc_tuh_eeg, nedc...@googlegroups.com
v1.5.1 of the TUH EEG Seizure Detection (TUSZ) Corpus is now available.
A summary of the corpus and details about the annotations can be found here:

https://www.isip.piconepress.com/projects/tuh_eeg/downloads/tuh_eeg_seizure/v1.5.1/_AAREADME.txt

https://www.isip.piconepress.com/projects/tuh_eeg/downloads/tuh_eeg_seizure/v1.5.1/_DOCS/seizures_v33r.xlsx

In this release, we have carefully checked the accuracy of the dev and
eval annotations. Within a couple of months we will release v1.6.0,
which will include revised training data annotations following the same
process we used for v1.5.1.

As always, feel free to email any questions you might have to
he...@nedcdata.org.

If you have previously downloaded v1.5.1 using rsync, please go ahead
and update your copy by running rsync again.

Best regards,

Joe Picone

P.S. The edf files and summary files released on April 1 have not
changed. We simply updated some documentation related to the corpus.

Joseph Picone

Apr 5, 2020, 11:52:02 PM
to nedc_tuh_eeg, nedc...@googlegroups.com
We have also made available our 26-dim linear frequency cepstral
coefficient (LFCC) feature vectors. These features are described in this
paper:

Harati, A., Golmohammadi, M., Lopez, S., Obeid, I.
and Picone, J. (2015). Improved EEG Event
Classification Using Differential Energy. Proceedings
of the IEEE Signal Processing in Medicine and Biology
Symposium (pp. 1-4). Philadelphia, Pennsylvania, USA.


https://www.isip.piconepress.com/publications/conference_proceedings/2015/ieee_spmb/denergy/paper_v04.docx

A description of how the features are stored is attached below.

To acquire these files, just re-run rsync:

rsync -auxvL nedc_t...@www.isip.piconepress.com:~/data/tuh_eeg_seizure/v1.5.1 .

You will see a new directory named "feats":

nedc_001_[1]: p
/home/nedc_tuh_eeg/home/nedc_tuh_eeg/data/tuh_eeg_seizure/v1.5.1
nedc_001_[1]: d
total 79
drwxrwxr-x 5 picone isip 7 Apr 5 23:14 ./
drwxrwxr-x 6 root 1001 6 Mar 29 03:01 ../
-rw-r--r-- 1 picone isip 9678 Mar 31 15:47 _AAREADME.txt
drwxrwxr-x 2 picone isip 10 Apr 5 18:44 _DOCS/
-r--r--r-- 1 picone isip 1868 Mar 29 18:23 _INSTRUCTIONS.txt
drwxrwxr-x 4 picone isip 4 Apr 5 20:23 edf/
drwxrwxr-x 4 picone isip 4 Apr 5 23:14 feats/

that contains the features stored in the same directory structure as the
edf files.

As always, let us know if you have questions.

-Joe

==========
The features are stored in binary data files which contain this information:

// (1) number of rows (number of channels) (4-byte int)
// (2) number of cols (number of frames) (4-byte int)
// (3) channel #0, frame #0:
//       no. of features (4-byte int)
//       features (4-byte float)
//     channel #0, frame #1:
//       no. of features (4-byte int)
//       features (4-byte float)
//     channel #0, last frame:
//       no. of features (4-byte int)
//       features (4-byte float)
//     ...
//     channel #1, frame #0:
//       no. of features (4-byte int)
//       features (4-byte float)
//     ...
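
A minimal Python sketch of a reader for this layout follows. It is not part of the release: the function names parse_feats and read_feats are hypothetical, and little-endian byte order is an assumption.

```python
import struct


def parse_feats(data):
    """Parse the bytes of one feature file.

    Returns feats[channel][frame] -> tuple of floats.
    """
    # Header: number of channels (rows) and number of frames (cols),
    # each a 4-byte int. Assumption: little-endian byte order.
    nchan, nframes = struct.unpack_from("<ii", data, 0)
    offset = 8
    feats = []
    for _ in range(nchan):
        frames = []
        for _ in range(nframes):
            # Every vector is preceded by its own dimension (4-byte int).
            (dim,) = struct.unpack_from("<i", data, offset)
            offset += 4
            # The vector itself: dim 4-byte floats.
            frames.append(struct.unpack_from("<%df" % dim, data, offset))
            offset += 4 * dim
        feats.append(frames)
    return feats


def read_feats(path):
    """Read a feature file from disk and parse it."""
    with open(path, "rb") as fp:
        return parse_feats(fp.read())
```

Because the dimension is read per vector rather than assumed, the sketch does not depend on every frame holding 26 features.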

Running "od -i" and "od -f" on a file reveals:

nedc_000_[1]: od -i -N 12
/data/isip/data/tuh_eeg_seizure/v1.5.1/feats/dev/01_tcp_ar/002/00000258/s003_2003_07_22/00000258_s003_t000.raw

0000000          22        2340          26
                 ^^        ^^^^          ^^
       no. channels  no. frames   no. feats

nedc_000_[1]: od -f -j12 -N104
/data/isip/data/tuh_eeg_seizure/v1.5.1/feats/dev/01_tcp_ar/002/00000258/s003_2003_07_22/00000258_s003_t000.raw
0000014 13.401839 2.1727476 0.32462516 0.69885826
0000034 -0.3026817 0.16332078 -0.088797666 0.06613201
0000054 2.2721477 -0.27506906 -0.052756462 0.12254754
0000074 -0.07974434 0.048647482 -0.01573051 0.0068437094
0000114 -0.0077251447 0.1230895 -0.03802236 -0.0358741
0000134 -0.018982377 0.014988172 -0.00493749 -0.0035373378

These are the 26 elements of the first feature vector for the first channel.

nedc_000_[1]: od -i -j116 -N4
/data/isip/data/tuh_eeg_seizure/v1.5.1/feats/dev/01_tcp_ar/002/00000258/s003_2003_07_22/00000258_s003_t000.raw

0000164 26
0000170
nedc_000_[1]: od -f -j120 -N4
/data/isip/data/tuh_eeg_seizure/v1.5.1/feats/dev/01_tcp_ar/002/00000258/s003_2003_07_22/00000258_s003_t000.raw

0000170 13.078412
0000174
nedc_000_[1]: od -f -j120 -N104
/data/isip/data/tuh_eeg_seizure/v1.5.1/feats/dev/01_tcp_ar/002/00000258/s003_2003_07_22/00000258_s003_t000.raw

0000170 13.078412 1.9224968 2.1705375 0.31227323
0000210 -0.45942664 0.20104696 0.10449887 -0.11398636
0000230 2.5109947 -0.35111377 -0.124504656 0.08458278
0000250 -0.049767997 0.038772505 -0.022805186 0.0020929435
0000270 -0.002347174 0.11078123 -0.073626794 -0.03549913
0000310 -0.054850843 0.026184069 -0.013303388 0.013030112
0000330 -0.00019972246 0.0035286713

These are the dimension and elements of the second feature vector for
channel no. 0. Every feature vector written is preceded by the dimension
of the vector.
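
The offsets passed to od above follow from the record layout. Here is a small Python sketch of the arithmetic, assuming every vector in the file has the same dimension (26 here); the function name vector_offset is hypothetical:

```python
def vector_offset(channel, frame, nframes, dim):
    """Byte offset of the dimension field for (channel, frame).

    Assumes a fixed vector dimension throughout the file.
    """
    header = 8              # two 4-byte ints: no. channels, no. frames
    record = 4 + 4 * dim    # 4-byte dimension + dim 4-byte floats
    return header + (channel * nframes + frame) * record


# Channel 0, frame 1 of the file above (2340 frames, 26 features):
off = vector_offset(0, 1, 2340, 26)
print(off, off + 4)         # 116 120: the -j values used with od
```

The dimension of a vector sits at the returned offset, and its first feature starts 4 bytes later.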

============
Subject: Announcement: TUH EEG Seizure v1.5.1
Date: Sun, 5 Apr 2020 21:52:54 -0400
From: Joseph Picone <joseph...@gmail.com>
To: nedc_tuh_eeg <nedc_t...@googlegroups.com>
CC: nedc...@googlegroups.com

Joseph Picone

Apr 7, 2020, 4:42:02 AM
to nedc_tuh_eeg, nedc...@googlegroups.com, Help
> v1.5.1 of the TUH EEG Seizure Detection (TUSZ) Corpus is now
> available.
>

A number of people have been asking questions about how to rsync the
data. If you have access to a Unix machine or Mac, set your current
working directory to the place you want to store the data on your local
machine, and then execute this command:

rsync -auxvL nedc_t...@www.isip.piconepress.com:~/data/tuh_eeg_seizure/v1.5.1 .

Enter the command all on one line. It will download the complete v1.5.1
release. Note that the " ." (space followed by dot) at the end of the
command is very important. I have attached a text file that has the
complete command - you can cut and paste from that.

If you are on a Windows box, you might want to install a Linux subsystem:

https://www.dataquest.io/blog/tutorial-install-linux-on-windows-wsl/

Windows has become much more Linux-friendly in recent years. There are
several other options, including PowerShell, but you definitely want to
use rsync.

Please note you can run rsync as many times as you want. It doesn't hurt
anything. Rsync will only download those files that have changed.
Eventually, it won't download anything because your copy is identical to
ours.

Finally, some of you are reporting time synchronization problems. In
this case, you might want to use the checksum option in rsync:

rsync -acvL nedc_t...@www.isip.piconepress.com:~/data/tuh_eeg_seizure/v1.5.1 .

This will run slower, but is very precise in the way it decides whether
your copy of each file matches ours.
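
The difference between the two modes can be illustrated with a conceptual Python sketch. This is not rsync's implementation: the function names are hypothetical, SHA-256 stands in for whatever checksum rsync actually uses, and the sketch only mimics the default size-and-modification-time test versus a comparison of file contents.

```python
import hashlib
import os
import tempfile


def quick_check_differs(src, dst):
    """Mimic the default test: sizes or modification times differ."""
    s, d = os.stat(src), os.stat(dst)
    return s.st_size != d.st_size or int(s.st_mtime) != int(d.st_mtime)


def checksum_differs(src, dst):
    """Mimic -c: compare file contents via a digest (SHA-256 stand-in)."""
    def digest(path):
        with open(path, "rb") as fp:
            return hashlib.sha256(fp.read()).hexdigest()
    return digest(src) != digest(dst)


def demo():
    """Two identical files whose timestamps disagree by an hour."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "ours.edf")
        dst = os.path.join(tmp, "yours.edf")
        for path in (src, dst):
            with open(path, "wb") as fp:
                fp.write(b"same bytes")
        # Shift the local copy's modification time back by one hour.
        os.utime(dst, (0, os.stat(src).st_mtime - 3600))
        return quick_check_differs(src, dst), checksum_differs(src, dst)


print(demo())  # (True, False)
```

With mismatched clocks the timestamp test flags the file as changed even though its contents are identical, which is the situation the checksum option avoids.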

As always, let us know if you have questions. You can send an email to
he...@nedcdata.org and we will try to answer your questions promptly.

-Joe Picone
x.txt

Joseph Picone

Apr 23, 2020, 11:05:17 PM
to nedc_tuh_eeg, nedc...@googlegroups.com
> We have also made available our 26-dim linear frequency cepstral
> coefficient (LFCC) feature vectors.

In addition to features, we have released EDF files downsampled to 250
Hz. The data is located here:


https://www.isip.piconepress.com/projects/tuh_eeg/downloads/tuh_eeg_seizure/v1.5.1/edf_resampled/

and available from our rsync server.

-Joe

Joseph Picone

May 27, 2020, 5:13:56 PM
to nedc_tuh_eeg, nedc...@googlegroups.com
TUH EEG Seizure (TUSZ) v1.5.2 is now available from our web server:

https://www.isip.piconepress.com/projects/tuh_eeg/downloads/tuh_eeg_seizure/v1.5.2/

v1.5.1, which was used in the Neureka 2020 Epilepsy Challenge
(https://neureka-challenge.com/), was a version of the database in which
we did an additional pass on the eval and dev annotations to make sure
they were accurate.

In v1.5.2, we have completed another pass on the training data as well.
So the entire TUSZ Corpus has been carefully reviewed.

Our next scheduled release, which will be v1.6.0, will consist of new
data collected from 2016 to mid-2019. We expect to make this release by
the end of the summer.

We also expect to release v2.0 of TUH EEG - the master database that
contains all EEGs we have collected through 2019.

Finally, the Neureka 2020 Epilepsy Challenge was recently completed. The
leaderboard can be viewed here:

https://neureka-challenge.com/results/

Congratulations to the two winning teams:

First Place: Biomed Irregulars, KU Leuven, Belgium
Second Place: Neurosyd, University of Sydney, Australia

As always, let us know if you have any questions.

-Joe



Joseph Picone

May 28, 2020, 1:30:03 AM
to nedc_tuh_eeg, nedc...@googlegroups.com
We have released an updated version of the scoring software that was
used in the Neureka 2020 Epilepsy Challenge:

https://www.isip.piconepress.com/projects/tuh_eeg/downloads/nedc_eval_eeg/v3.3.3/

This software is popular because the interface is pretty simple - a
single hypothesis file. It is currently very specific to the Neureka
challenge.

We also distribute a more general version of the software. An updated
version of it will be released shortly.

-Joe

Joseph Picone

Aug 21, 2020, 5:07:35 PM
to nedc_tuh_eeg, nedc...@googlegroups.com
v4.0.0 of our scoring software, nedc_eval_eeg, is available here:

https://www.isip.piconepress.com/projects/tuh_eeg/downloads/nedc_eval_eeg/nedc_eval_eeg_v4.0.0.tar.gz

It is also available from our rsync server.

This version integrates the competition version of the scoring software
with our regular distribution. The NIST software, which is used to
implement the ATWV metric, is now optional.

The package includes an extensive set of regression tests and some
simple examples of how to use it.

Let us know if you have any questions.

-Joe