EEG Digitization in the official round


Dávid Gajdoš

Jul 5, 2023, 9:54:21 AM
to physionet-challenges

Dear Challenge team,

While processing the new dataset, we encountered a problem with the digitization: it is inconsistent between individual records, and even between channels of the same patient within the same record.

We downloaded the new helper code and are using it, since it changed for the official round. Reading a file directly with the WFDB library now even gives different output than the helper code.

In the unofficial round, the signals were digitized with ADC gain = 32/uV, i.e. one digital unit corresponded to 1/32 = 0.03125 microvolts:

ICARE_0613_67 18 100 30000

ICARE_0613_67.mat 16+24 32/uV 16 0 0 0 0 Fp1-F7

ICARE_0613_67.mat 16+24 32/uV 16 0 0 0 0 F7-T3

ICARE_0613_67.mat 16+24 32/uV 16 0 0 0 0 T3-T5

ICARE_0613_67.mat 16+24 32/uV 16 0 0 0 0 T5-O1

ICARE_0613_67.mat 16+24 32/uV 16 0 0 0 0 Fp2-F8

(Example file from the unofficial round with a random set of channels)
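The gain field in header lines like these can be read programmatically. Below is a minimal Python sketch, assuming the WFDB header convention that the third field of a signal line is the ADC gain, optionally followed by "(baseline)" and "/units". The function names are hypothetical and are not part of the Challenge helper code.

```python
def parse_gain_field(field):
    """Split a WFDB gain field such as '32/uV' or '200(-5)/mV'.

    Returns (gain, baseline, units); baseline and units are None when absent.
    """
    units = None
    baseline = None
    if "/" in field:
        field, units = field.split("/", 1)
    if "(" in field:
        field, rest = field.split("(", 1)
        baseline = int(rest.rstrip(")"))
    return float(field), baseline, units


def step_per_digital_unit(signal_line):
    """Physical size of one digital unit for a header signal line."""
    gain, _, units = parse_gain_field(signal_line.split()[2])
    return 1.0 / abs(gain), units


line = "ICARE_0613_67.mat 16+24 32/uV 16 0 0 0 0 Fp1-F7"
step, units = step_per_digital_unit(line)
print(step, units)  # 0.03125 uV
```

The same function can be applied to a signal line whose gain field has no units suffix, in which case units is returned as None.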


In the official round, it looks as if the ADC gain was set dynamically according to the signal range, e.g.:

0613_077_067_EEG 19 500 1800000

0613_077_067_EEG.mat 16+24 -0.0003181560314260423 16 0 1 1800000 0 Fp1

0613_077_067_EEG.mat 16+24 6.291050794970943e-06 16 0 1 1800000 0 Fp2

0613_077_067_EEG.mat 16+24 -0.0003429266798775643 16 0 1 1800000 0 F3

0613_077_067_EEG.mat 16+24 -0.00020546276937238872 16 0 1 1800000 0 F4

0613_077_067_EEG.mat 16+24 -0.00013665278675034642 16 0 1 1800000 0 C3

0613_077_067_EEG.mat 16+24 -0.0002076699456665665 16 0 1 1800000 0 C4

0613_077_067_EEG.mat 16+24 -0.0001065958640538156 16 0 1 1800000 0 P3

(Same file in official round, same patient/hour/channels as in unofficial round, but now with full 1h record)

At the same time, some ADC gain values are even negative (and sometimes reach large absolute values, e.g. -32768). If the ADC gain is very small, e.g. 0.001, one digital unit corresponds to 1000 microvolts. Given the physiological range of EEG amplitudes, a signal digitized this way is unusable.
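To illustrate the resolution argument, here is a small Python sketch that quantizes a value with a given gain and converts it back. It assumes the values are in microvolts and that digitization rounds to the nearest integer; both are assumptions for illustration, and the 50 uV amplitude is a made-up example, not taken from the dataset.

```python
def quantize_roundtrip(value, gain, baseline=0):
    """Digitize a physical value and convert it back to physical units."""
    digital = round(value * gain) + baseline
    return (digital - baseline) / gain


# Hypothetical 50 uV EEG amplitude.
print(quantize_roundtrip(50.0, gain=32.0))   # 50.0
print(quantize_roundtrip(50.0, gain=0.001))  # 0.0
```

With a gain of 0.001, anything well below 500 uV in magnitude rounds to the same digital value, so typical EEG activity would not survive digitization.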

It seems that in a large number of channels the ADC gain was set to a relatively small value (in absolute value significantly smaller than 32/uV, or even smaller than 1/uV). This makes it possible to capture extreme values caused by measurement errors or artifacts, but leads to low resolution in the physiologically interesting range. Moreover, there are significant differences not only between individual EEGs, but sometimes also between individual channels within one EEG. This can affect the calculation of channel differences when re-referencing the montage.

In my opinion, it would be better to clip extreme values and choose a uniform ADC gain during digitization, large enough to give the required resolution in the physiologically interesting range, since the share of signals degraded in this way in the dataset is significant.
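The re-referencing concern can be sketched as follows: when two channels have different gains, their difference has to be computed on the converted values, not on the raw digital values. This is a minimal Python sketch with a hypothetical function name and made-up sample values; it is not how the helper code computes montages.

```python
def bipolar(digital_a, gain_a, digital_b, gain_b, baseline_a=0, baseline_b=0):
    """Difference of two channels after converting each to physical values."""
    a = [(d - baseline_a) / gain_a for d in digital_a]
    b = [(d - baseline_b) / gain_b for d in digital_b]
    return [x - y for x, y in zip(a, b)]


# Made-up digital samples for two channels with different gains.
ch_a, gain_a = [32, 64], 32.0
ch_b, gain_b = [16, 16], 16.0

print(bipolar(ch_a, gain_a, ch_b, gain_b))  # [0.0, 1.0]
print([x - y for x, y in zip(ch_a, ch_b)])  # [16, 48], raw difference is not meaningful
```

Even when the conversion is done correctly, the resolution of the difference is limited by the coarser of the two channels.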


With greetings, David

PhysioNet Challenge

Jul 5, 2023, 9:58:28 AM
to physionet-challenges
Dear David,

Thanks for sharing these observations.

As you noticed, the updated data capture the full range of each channel, so they have lower resolution in the physiologically meaningful range of the signal. Unfortunately, we do not have the physical units for the data because some of the data had already been scaled by the data sources, so it is more difficult to choose a uniform ADC gain for the data. This was one of the challenges in preparing the much larger dataset during the hiatus. (The header files should indicate no units instead of missing units.)

As you know, the example code automatically converts the digital values back to analog values using the formula "analog = (digital - baseline) / gain". As you noticed, different WFDB packages result in different values because of different treatments of the ADC zero or baseline value. We would like to update the data to address these issues, but, due to the size of the data, we will not be able to update the data before the end of the official phase.
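For reference, the formula quoted above can be written as a short Python sketch. The function name is hypothetical and this is not the helper code itself. It also assumes, as we understand the WFDB header format, that the baseline falls back to the ADC zero when the header gives no explicit baseline; a reader that handles this fallback differently would produce different analog values.

```python
def to_analog(digital, gain, baseline=None, adc_zero=0):
    """Apply analog = (digital - baseline) / gain to a list of samples."""
    if baseline is None:
        # Assumed fallback: use the ADC zero when no baseline is given.
        baseline = adc_zero
    return [(d - baseline) / gain for d in digital]


print(to_analog([0, 32, -64], gain=32.0))  # [0.0, 1.0, -2.0]
```

Note that a negative gain, as seen in some official-round headers, flips the sign of every converted sample.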

In this particular case (0613_077_067_EEG), the signal is constant for each channel, and the ADC gain is simply the single (negative) value for the channel. Of course, you may or may not decide that such recordings are not useful for training.
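If you decide to exclude such recordings, constant channels are easy to detect. The following Python sketch uses hypothetical function names and made-up samples, and it assumes the channels are already loaded as a mapping from channel name to a list of samples.

```python
def is_constant(samples):
    """True if every sample in the channel has the same value."""
    return len(set(samples)) <= 1


def drop_constant_channels(channels):
    """Keep only the channels whose samples actually vary."""
    return {name: s for name, s in channels.items() if not is_constant(s)}


# Made-up example: one flat channel and one varying channel.
channels = {"Fp1": [1, 1, 1, 1], "Fp2": [3, -2, 5, 0]}
print(list(drop_constant_channels(channels)))  # ['Fp2']
```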

(On behalf of the Challenge team.)

Please post questions and comments in the forum. However, if your question reveals information about your entry, then please email info at We may post parts of our reply publicly if we feel that all Challengers should benefit from it. We will not answer emails about the Challenge to any other address. This email is maintained by a group. Please do not email us individually.