Thank you for reaching out and for your interest in the 2016 PhysioNet Challenge.
As you noticed, the annotation file that the Challenge Organizers shared has a column for a unique patient ID, and this file only has entries for this column for database B in the training set: https://physionet.org/content/challenge-2016/1.0.0/annotations/Online_Appendix_training_set.csv
I understand that you would like the entries in this column for the other databases in the training set so that you can split the training set into "local" training and test sets so that no patient has recordings in both sets.
Please see this link for additional annotations for this dataset: https://iopscience.iop.org/article/10.1088/0967-3334/37/12/2181/data
Unfortunately, the unique patient IDs are not available for all of the other databases (or they would have been shared in the annotation file). Moreover, while you are (correctly!) trying to reduce leakage of information from the test set, one still cannot make principled comparisons of performance between models tested on a "local" test set and models tested on the actual test set. Fortunately, no patient has recordings in both the actual training and test sets. While the test set is not available, we may be able to score your code on the test set. Please see this link for more information and contact us by email (info at physionetchallenge.org) if you are interested: https://moody-challenge.physionet.org/faq/#run-code
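For the recordings that do have patient IDs (database B in the annotation file above), a patient-level split could look roughly like the sketch below. This is only an illustration, not Challenge code: the function name `split_by_patient`, the record names, and the patient IDs are all hypothetical, and you would need to read the real (recording, patient ID) pairs from the annotation file yourself. The idea is to shuffle and divide the patients rather than the recordings, so that every recording from a given patient ends up on the same side.

```python
import random
from collections import defaultdict


def split_by_patient(records, test_fraction=0.2, seed=0):
    """Split (recording_id, patient_id) pairs so that no patient spans both sets.

    Hypothetical helper for illustration; returns
    (train_ids, test_ids, test_patients).
    """
    # Group the recordings by the patient they belong to.
    by_patient = defaultdict(list)
    for recording_id, patient_id in records:
        by_patient[patient_id].append(recording_id)

    # Shuffle patients (not recordings) with a fixed seed for repeatability.
    patients = sorted(by_patient)
    random.Random(seed).shuffle(patients)

    # Hold out a fraction of the patients, at least one.
    n_test = max(1, round(test_fraction * len(patients)))
    test_patients = set(patients[:n_test])

    train_ids = [r for p in patients if p not in test_patients for r in by_patient[p]]
    test_ids = [r for p in patients if p in test_patients for r in by_patient[p]]
    return train_ids, test_ids, test_patients


# Made-up example data: recording names and patient IDs are placeholders.
records = [
    ("rec_01", "patient_A"),
    ("rec_02", "patient_A"),
    ("rec_03", "patient_B"),
    ("rec_04", "patient_C"),
    ("rec_05", "patient_C"),
    ("rec_06", "patient_D"),
]
train_ids, test_ids, test_patients = split_by_patient(records, test_fraction=0.25)
```

Because the split is made over patients, the fraction of recordings in the "local" test set will only approximately match `test_fraction` when patients have different numbers of recordings.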
(On behalf of the Challenge team.)
Please post questions and comments in the forum. However, if your question reveals information about your entry, then please email info at physionetchallenge.org. We may post parts of our reply publicly if we feel that all Challengers should benefit from it. We will not answer emails about the Challenge to any other address. This email is maintained by a group. Please do not email us individually.