About the scoring procedure.


Phil S.

Mar 17, 2021, 5:56:42 PM
to physionet-challenges
Thanks to the organizers for hosting this interesting challenge. 
The real-world data provided is, as you said, messy and poses some challenges of its own.

One problem I have with the data is that not every ECG is annotated for every pathology. 
This does not cause problems during training, but does cause issues during testing/validation.

In order to perform well on the validation data, our model must implicitly or explicitly learn to recognize which database a recording came from, so that it avoids predicting pathologies that were never annotated there, even when they are clearly present.

This will affect the generalization of our algorithms. 

The solution I propose is to mask out pathologies that have not been annotated in a particular database before evaluating each ECG from that database.

Without this change, we can only see in the public scorecard whether our code trained successfully or not.
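To make the proposal concrete, here is a minimal sketch of per-database output masking. The class names, database names, and the `ANNOTATED` mapping are all hypothetical illustrations, not the Challenge's actual label set or scoring code:

```python
# Sketch of the proposed masking step, with hypothetical class and
# database names. The idea: zero out a model's outputs for any class
# that was never annotated in the recording's source database, so the
# scorer does not penalize detections that could never be confirmed.
import numpy as np

CLASSES = ["AF", "LBBB", "RBBB", "STach"]  # illustrative label set

# Hypothetical map: which classes each source database annotates.
ANNOTATED = {
    "db_a": {"AF", "LBBB"},
    "db_b": {"AF", "RBBB", "STach"},
}

def mask_predictions(pred, source_db):
    """Zero predictions for classes not annotated in source_db."""
    annotated = ANNOTATED[source_db]
    mask = np.array([c in annotated for c in CLASSES], dtype=pred.dtype)
    return pred * mask

# Example: a recording from db_a where the model fires on AF, LBBB, STach.
pred = np.array([1, 1, 0, 1])
masked = mask_predictions(pred, "db_a")  # STach is masked -> [1, 1, 0, 0]
```

This masking would happen inside the evaluation script, before the confusion matrices or scores are computed; the model itself would be unchanged.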

PhysioNet Challenge

Mar 17, 2021, 6:01:59 PM
to physionet-challenges
Dear Phil,

Thank you for raising your concerns about algorithm generalizability and for your suggestions to improve it.

As you noticed, some databases do not include annotations for some diagnoses or classes. You suggested that classifiers must learn the source of the recordings to perform well on these databases, and you suggested that we should disregard or mask classifier outputs for classes that are not present in a given database.

While classifiers that learn the source of recordings may perform better on new recordings from existing databases, they are less likely to perform better on recordings from new databases. Most of the recordings in the test data are from sources that are not represented in the training data, so algorithms that rely on recognizing the databases will fail when they encounter these recordings from new databases.

We specifically structure the Challenge to encourage generalization to new populations, which may have different classes and are likely to have different class distributions. This is reflective of the real world: we develop algorithms, port them to commercial products, and those products are subsequently used on populations that were not represented in the training set. Sharing information about the hidden test data (or using this information to change how we evaluate algorithms) would result in overly optimistic performance statistics. Our test data include populations that are represented in the training data, as well as a population that is not (although it should be quite similar to much of the training data).

All of the diagnoses or classes in the validation and test data are present in the training data, so the hidden data does not contain any “new” cardiac abnormalities.

Best,
Gari, Matt, and Nadi

(On behalf of the Challenge team.)

https://PhysioNetChallenges.org/
https://PhysioNet.org/

Please post questions and comments in the forum. However, if your question reveals information about your entry, then please email challenge at physionet.org. We may post parts of our reply publicly if we feel that all Challengers should benefit from it. We will not answer emails about the Challenge to any other address. This email is maintained by a group. Please do not email us individually.