Training dataset

Ján Pavlus

Jan 29, 2024, 5:07:07 PM
to physionet-challenges
Dear PhysioNet Challenge team,

As in previous years, the model will be trained on your side after submission. However, this year the dataset is not fixed, because we generate it with ecg-image-kit using different parameter settings. The number of ways to configure the image generator to produce the training set brings a new variable to the Challenge.
How will the training dataset be generated on your side?

Best
Jan

PhysioNet Challenge

Jan 29, 2024, 5:10:59 PM
to physionet-challenges
Dear Jan,

This is a good question, and certainly a new wrinkle for this year's Challenge.

We will share more information about the process soon, but we are planning to provide the same training set (currently the PTB-XL dataset) with the same ECG waveforms for each team, and each team can choose how to use or augment these data to generate different ECG images for training their models.

To help, we are also planning to provide an ECG image without distortions for each ECG waveform (the ECG images that you produce by following the instructions in the example code), but the teams may want to create different or additional images with various distortions and artifacts to better train their models for recovering the ECG waveforms from the ECG images and/or classifying the ECG images; this part of the Challenge will be especially important this year.

More details soon, but please feel free to beta test before then!

Best,
Matt
(On behalf of the Challenge team.)

Please post questions and comments in the forum. However, if your question reveals information about your entry, then please email info at physionetchallenge.org. We may post parts of our reply publicly if we feel that all Challengers should benefit from it. We will not answer emails about the Challenge to any other address. This email is maintained by a group. Please do not email us individually.

Ján Pavlus

Jan 30, 2024, 10:52:20 AM
to physionet-challenges
Dear Matt,

Thank you for the response. I understand that we can generate the images using whatever combination of arguments we specify. However, generating one image took around 20 s on average (with fully random settings). I tried generating the images on different machines, including powerful ones. The PTB-XL dataset has around 20k ECG recordings. This means that one pass of generation over the dataset would take a very long time, and it will not be possible to do it in the submitted code on your machines because of the time limits for training a submission. Is it then possible to train the model on our side and use our pre-trained model? If yes, what is the permitted way to do it?

Thank you for your reply,
Jan
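
The run-time concern in the message above can be checked with a short calculation. This is only a rough sketch: it uses the figures quoted in the post (about 20 s per image, about 20k recordings), and the function name, the `images_per_record` and `workers` parameters, and the assumption of perfect parallel scaling are illustrative, not part of ecg-image-kit or the Challenge code.

```python
def estimate_generation_hours(n_records, seconds_per_image,
                              images_per_record=1, workers=1):
    """Estimate wall-clock hours for one pass of image generation.

    Assumes every image costs the same and that work splits evenly
    across workers (an optimistic assumption for illustration only).
    """
    if workers < 1:
        raise ValueError("workers must be at least 1")
    total_seconds = n_records * images_per_record * seconds_per_image
    return total_seconds / workers / 3600.0


# Figures quoted in the thread: ~20k recordings, ~20 s per image.
single = estimate_generation_hours(20_000, 20.0)
print(f"one worker: {single:.1f} h")      # about 111.1 h

# Hypothetical 8-way parallel run under the same assumptions.
parallel = estimate_generation_hours(20_000, 20.0, workers=8)
print(f"eight workers: {parallel:.1f} h")  # about 13.9 h
```

Under these assumptions, a single pass with fully random settings takes more than four days on one worker, which is consistent with the concern that it would not fit within a submission's training time limit.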

PhysioNet Challenge

Jan 30, 2024, 10:55:53 AM
to physionet-challenges
Dear Jan,

Yes, as you observed, some of the artifacts generated by ECG-Image-Kit are more computationally demanding than others. The table in the "Run-time Benchmarks" section of the README file lists estimated run times for different features of the code, and the last run time is comparable to your observations:
https://github.com/alphanumericslab/ecg-image-kit/tree/main/codes/ecg-image-generator#run-time-benchmarks

As you noticed, it would take too long to generate synthetic ECG images with every possible distortion for every record in the training set. Therefore, for the sake of efficiency, we're planning to ask teams to upload the code that they used to prepare and train their models along with pre-trained models that can be further trained using transfer learning on a new dataset. We will share more details soon, but the basic idea is that we will try to make accommodations for the computational demands of generating large numbers of synthetic ECG images while trying to preserve the reproducibility of the code and generalizability of the models.

Best,
Reza, Gari, Matt
(On behalf of the Challenge team.)
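
Since generating every distortion for every record is too slow, one option for teams is to sample a subset of (record, distortion setting) pairs that fits a fixed time budget. The sketch below shows the idea only. The setting names ("clean", "light", "heavy") and their per-image costs are placeholders rather than ecg-image-kit options or measured benchmarks, and `plan_generation` is a hypothetical helper.

```python
import random


def plan_generation(record_ids, config_costs, budget_seconds, seed=0):
    """Randomly pick (record, config) pairs whose total cost fits the budget.

    config_costs maps a hypothetical setting name to an assumed
    per-image generation cost in seconds.
    """
    rng = random.Random(seed)  # fixed seed keeps the plan reproducible
    pairs = [(rid, name) for rid in record_ids for name in config_costs]
    rng.shuffle(pairs)

    plan, used = [], 0.0
    for record_id, name in pairs:
        cost = config_costs[name]
        if used + cost > budget_seconds:
            continue  # this pair does not fit; a cheaper one still might
        plan.append((record_id, name))
        used += cost
    return plan, used


# Placeholder costs; only the 20 s figure echoes the thread.
costs = {"clean": 1.0, "light": 5.0, "heavy": 20.0}
plan, used = plan_generation(range(1000), costs, budget_seconds=3600)
print(len(plan), "images planned,", used, "seconds of budget used")
```

Fixing the seed means the same plan can be regenerated later, which helps with the reproducibility goal mentioned above.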


parshuram arotale

Feb 2, 2024, 3:53:12 PM
to physionet-challenges
Hi Matt,
I have a query regarding digitizing and classifying ECG images. Do we have to generate synthetic ECG images from the PTB-XL dataset, with or without distortions, then recover the ECG signals from those images and apply a DL model to classify the recovered signals, and also classify the synthesized ECG images (pre-trained models, transfer learning)?
Thank you.

Parshuram Aarotale

PhysioNet Challenge

Feb 2, 2024, 3:55:25 PM
to physionet-challenges
Dear Parshuram,

We will share more details about the training process soon, but you are largely able to decide how to train your models, including what kind of models you use and how many and what kind of ECG images you generate and/or use for training. We are sharing the synthetic ECG image generator as a helpful tool, but you can decide if and how to use it.

We will apply your models to hidden ECG images that are not part of the PTB-XL dataset to reconstruct the ECG waveforms and to classify the ECGs, and we will evaluate how well they perform on the hidden data.

You can decide if you want to submit models for the ECG waveform reconstruction task and/or the ECG classification task. As above, you can largely decide how you approach these tasks as well.


Best,
Matt
(On behalf of the Challenge team.)
