Dear colleagues,
We are excited to announce the launch of Task 3 of the CohortX Challenge at MICCAI 2026.
Accurate clinical coding is essential for healthcare analytics, clinical cohort selection, medical database retrieval, and medical research. However, mapping medical condition names to standardized ICD-10-CM codes remains a challenging task due to overlapping symptoms, related diagnoses, and potential differential diagnoses.
In Task 3, participants will develop machine learning models that resolve medical condition names to their correct ICD-10-CM codes. This challenge reflects real-world clinical coding scenarios where distinguishing between similar conditions is critical.
The dataset explicitly models relationships between conditions and groups of ICD codes:
- KEEP – ICD-10-CM codes representing the correct diagnosis
- ASSOCIATION – Related ICD codes that may be linked to the condition but require further screening of medical records for confirmation
- DIFF – Differential diagnosis codes representing conditions that may be clinically confused with the target condition
Participants are expected to build models capable of identifying the correct codes while distinguishing them from related or confusable ones, addressing a core challenge in automated medical coding.
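To make the three groups concrete, the following is a minimal Python sketch of how one record and a model's predictions for it might be represented. The class name, field names, the helper function, and the example condition and codes are illustrative assumptions only; they are not the official dataset schema, sample data, or evaluation metric of the challenge.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConditionRecord:
    """Hypothetical container for one condition and its three code groups."""
    condition: str
    keep: frozenset                      # codes for the correct diagnosis
    association: frozenset = frozenset() # related codes needing record review
    diff: frozenset = frozenset()        # differential-diagnosis codes

    def group_of(self, code):
        """Return the group a code belongs to, or 'UNLISTED' if in none."""
        if code in self.keep:
            return "KEEP"
        if code in self.association:
            return "ASSOCIATION"
        if code in self.diff:
            return "DIFF"
        return "UNLISTED"


def summarize_predictions(record, predicted_codes):
    """Count how many distinct predicted codes fall into each group."""
    counts = {"KEEP": 0, "ASSOCIATION": 0, "DIFF": 0, "UNLISTED": 0}
    for code in set(predicted_codes):
        counts[record.group_of(code)] += 1
    return counts


# Illustrative record; the grouping of these codes is assumed for the example
# and is not taken from the challenge dataset.
example = ConditionRecord(
    condition="type 2 diabetes",
    keep=frozenset({"E11.9"}),
    association=frozenset({"E66.9"}),
    diff=frozenset({"E10.9"}),
)

print(summarize_predictions(example, ["E11.9", "E10.9", "Z00.00"]))
# {'KEEP': 1, 'ASSOCIATION': 0, 'DIFF': 1, 'UNLISTED': 1}
```

In this sketch, a prediction that lands in DIFF or ASSOCIATION is counted separately from one that lands in no group at all, which mirrors the distinction the dataset draws between confusable codes and unrelated ones.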
Healthcare providers often document diagnoses using free-text descriptions, which must later be mapped to standardized ICD-10-CM codes used for clinical documentation, medical data retrieval, epidemiological studies, and medical research. Advancing automated coding methods can therefore significantly improve clinical NLP systems, healthcare data interoperability, and large-scale medical research.
We look forward to seeing the innovative solutions the community will develop to advance AI-driven clinical coding and medical NLP.
Best regards,
The CohortX Challenge Organizing Team