Question about evaluate_model.py, ICD_codes_CI.csv, and create_labels.py

Allan Moser

Jun 26, 2026, 4:07:59 PM
to physionet-challenges
For evaluate_model.py: is the required input "prevalence_labels" the file "ICD_codes_CI.csv" that is included with the training data, or is it a file we create with the program "create_labels.py", using "demographics.csv" and "ICD_codes_CI.csv" as inputs?

Also, could you explain the statement in the documentation for evaluate_model.py that says "demographics.csv" will be used for the full training set for prevalence_labels?

Thank you,
Allan Moser

PhysioNet Challenge

Jun 26, 2026, 4:26:31 PM
to physionet-challenges
Dear Allan,

Thanks for these questions.

The required input "prevalence_labels" is the file demographics.csv, which is included with the large version of the training set: https://www.kaggle.com/datasets/physionet/physionetchallenge2026datalargeversion

The evaluate_model script uses this file for the prevalence-based reward metric. It computes the prevalence of cognitive impairment at different ages so that rarer predictions earn higher rewards, for example predictions of cognitive impairment in younger patients who have future cognitive impairment diagnoses.
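
As a rough illustration of the idea only, and not the actual evaluate_model code, the sketch below computes prevalence per age bin and turns it into a weight that grows as prevalence shrinks. The function names, the 10-year bin width, the inverse-prevalence weighting, and the toy ages and labels are all assumptions made up for this example; the real script may bin ages and scale rewards differently.

```python
from collections import defaultdict


def prevalence_by_age_bin(ages, labels, bin_width=10):
    """Return the fraction of positive labels in each age bin, keyed by bin start."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for age, label in zip(ages, labels):
        start = int(age // bin_width) * bin_width
        totals[start] += 1
        positives[start] += int(label)
    return {start: positives[start] / totals[start] for start in sorted(totals)}


def reward_weight(age, prevalence, bin_width=10, floor=1e-3):
    """Hypothetical weight: inverse of the prevalence in the patient's age bin.

    The floor avoids division by zero for bins with no positive cases.
    """
    start = int(age // bin_width) * bin_width
    return 1.0 / max(prevalence.get(start, 0.0), floor)


# Toy data invented for illustration; not taken from demographics.csv.
ages = [52, 55, 58, 59, 71, 74, 76, 78]
labels = [0, 0, 0, 1, 1, 0, 1, 1]

prevalence = prevalence_by_age_bin(ages, labels)
print(prevalence)
print(reward_weight(55, prevalence), reward_weight(75, prevalence))
```

In this toy data, cognitive impairment is rarer in the 50-59 bin than in the 70-79 bin, so a prediction for a 55-year-old gets a larger weight than one for a 75-year-old.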

Best,
Audrey