Question about evaluate_model.py, ICD_codes_CI.csv, and create_labels.py


Allan Moser

Jun 26, 2026, 4:07:59 PM

to physionet-challenges

For evaluate_model.py: is the required input "prevalence_labels" the file included with the training data, "ICD_codes_CI.csv", or is it a file we create with the program "create_labels.py", using "demographics.csv" and "ICD_codes_CI.csv" as inputs?

Also, could you explain the statement in the documentation for evaluate_model.py that says "demographics.csv" will be used for the full training set for prevalence_labels?

Thank you,
Allan Moser

PhysioNet Challenge

Jun 26, 2026, 4:26:31 PM

to physionet-challenges

Dear Allan,

Thanks for these questions.

The required input "prevalence_labels" is the file demographics.csv, which is included with the large version of the training set: https://www.kaggle.com/datasets/physionet/physionetchallenge2026datalargeversion

The evaluate_model script uses this file for the prevalence-based reward metric: it computes the prevalence of cognitive impairment at different ages to provide higher rewards for rarer predictions, such as predictions of cognitive impairment in younger patients with future cognitive impairment diagnoses.
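
To make the idea concrete, here is a rough sketch of a prevalence-based reward. It is not the code in evaluate_model.py. The (age, label) record layout, the 10-year age bins, the function names, and the "1 - prevalence" weighting are all assumptions chosen for illustration, and the records are made up.

```python
from collections import defaultdict


def prevalence_by_age(records, bin_width=10):
    """Return {age_bin_start: fraction of patients with a positive label}.

    `records` is an iterable of (age, label) pairs; this layout is a
    hypothetical stand-in for whatever demographics.csv actually contains.
    """
    counts = defaultdict(lambda: [0, 0])  # bin start -> [positives, total]
    for age, label in records:
        bin_start = int(age // bin_width) * bin_width
        counts[bin_start][0] += int(label)
        counts[bin_start][1] += 1
    return {b: pos / total for b, (pos, total) in sorted(counts.items())}


def reward_for_true_positive(age, prevalence, bin_width=10):
    """Hypothetical reward: larger when the condition is rarer at this age.

    The 1 - prevalence weighting is an assumption, not the Challenge formula.
    """
    bin_start = int(age // bin_width) * bin_width
    return 1.0 - prevalence[bin_start]


# Made-up (age, label) pairs: label 1 means a cognitive impairment diagnosis.
records = [(55, 0), (58, 0), (52, 0), (57, 1),
           (82, 1), (85, 1), (88, 0), (81, 1)]

prevalence = prevalence_by_age(records)
younger_reward = reward_for_true_positive(57, prevalence)
older_reward = reward_for_true_positive(85, prevalence)
```

With these made-up records, a correct positive prediction for the 57-year-old earns more than one for the 85-year-old, because cognitive impairment is rarer in the younger age bin.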

Best,

Audrey
