We need to figure out the label mapping between two clusterings. It describes how the labels of the two clusterings are related to each other. For example, when A assigns label 1 to some object, what is the most likely label assigned by B?
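One way to make this concrete is to count how often each pair of labels co-occurs and, for every label of A, pick the label of B that occurs most often with it. The following is a minimal sketch under that assumption; the function name `label_mapping` and the toy labels are illustrative and not taken from any particular library.

```python
from collections import Counter, defaultdict


def label_mapping(labels_a, labels_b):
    """For each label of clustering A, return the most frequent label of clustering B."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both clusterings must label the same objects")
    # co_counts[a][b] = number of objects labelled a by A and b by B.
    co_counts = defaultdict(Counter)
    for a, b in zip(labels_a, labels_b):
        co_counts[a][b] += 1
    # The most common B label is the "most likely" counterpart of each A label.
    return {a: counts.most_common(1)[0][0] for a, counts in co_counts.items()}


# Toy example: six objects labelled by two clusterings.
labels_a = [1, 1, 1, 2, 2, 3]
labels_b = ["x", "x", "y", "z", "z", "y"]
mapping = label_mapping(labels_a, labels_b)
print(mapping)
```

Note that this mapping is not necessarily one-to-one: two labels of A may map to the same label of B, and ties are broken by whichever label was seen first.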
The goal of this research was to gain a general overview of the present state of agroecology in Hungary, through attaining an understanding of the historical and political contexts in which agroecological initiatives developed and currently function, but also to map agroecology-related initiatives, research and their networks. Our goal was also to interpret agroecological principles in the Hungarian context, providing a theoretical background for future research and cooperation. Since the mapping would serve as a basis for advancing agroecology in Hungary, the research was also aimed at apprehending in detail the main drivers and challenges that the different actors and networks are facing.
The report finds that Hungary, considering a transformation towards agroecology, is well situated with its history and present richness of actors all over the country. Still, any transformation will only happen if the actors cooperate formally, and therefore can advocate for agroecological transition in a coordinated manner. Agroecology provides a desirable policy objective with the potential to mobilise farmers and other people working in agriculture, researchers, activists and consumers for a common goal: to create a regenerative, socially just, healthy food system in Hungary. As agroecology advances in Europe and the world, Hungarian initiatives could benefit from projects that connect them to similar international partners.
Data harmonisation is essential in real-world data (RWD) research projects based on hospital information systems databases, as coding systems differ between countries. The Hungarian hospital information systems and the national claims database use internationally known diagnosis codes, but data on medical procedures are recorded using national codes. There is no simple or standard solution for mapping the national codes to a standard coding system. Our aim was to map the Hungarian procedure codes (OENO) to SNOMED CT as part of the European Health Data Evidence Network (EHDEN) project.
We recruited 25 professionals from different specialties to manually map the procedure codes used between 2011 and 2021. A mapping protocol and training material were developed, results were regularly revised, and the challenges of mapping were recorded. Approximately 7% of the codes were mapped by two or three specialty groups for validation purposes.
We mapped 4661 OENO codes to standard vocabularies, mostly SNOMED CT. We categorized the challenges into three main areas: semantic, matching, and methodological. Semantic challenges refer to the occasionally unclear meaning of the OENO codes; matching challenges refer to the different granularity and purpose of the OENO and SNOMED CT vocabularies. Lastly, methodological challenges describe issues related to the design of these two vocabularies.
The challenges and solutions presented here may help other researchers to design their process to map their national codes to standard vocabularies in order to achieve greater consistency in mapping results. Moreover, we believe that our work will allow for better use of RWD collected in Hungary in international research collaborations.
Secondary use of routinely collected health data has become essential in healthcare research. Real-world data (RWD) are playing an increasingly important role in generating real-world evidence for regulatory decisions about marketing approval of drugs [1, 2] or in health technology assessments to support reimbursement decisions.
The main sources of RWD are claim databases and electronic medical records (EMRs). Although both data sources have the advantage of containing structured, easy-to-process information about diagnoses and procedures, their use has several challenges in international research collaboration. There are differences in the database structure, design and content [3]. In addition, multinational data collection and analysis is limited because of differences in coding systems across countries for which internationally valid mappings do not always exist [4].
The need for data harmonization has been recognized in the EU. The European Health Data Evidence Network (EHDEN) [4] and the Data Analysis and Real World Interrogation Network (DARWIN EU) [5] adopt common data models and establish federated data networks for research collaboration. The open-source Observational Medical Outcomes Partnership Common Data Model (OMOP CDM) [6] standardizes the data structure, format, and terminology of datasets from various sources, vendors and countries. This enables the application of common analysis code through a federated data network where only the code and aggregated results are shared, but not the data [7].
In many countries medical conditions are coded using national versions of ICD-10. The recently published new version of the WHO International Classification of Diseases (ICD-11) will lead to much more meaningful and detailed coding of patient data in the future [8]. However, procedures are usually represented with country-specific codes. Procedure codes are influenced by different healthcare and reimbursement systems. Therefore, they tend to show more heterogeneity than expected. Thus, mapping local procedure codes to harmonized standards is necessary to facilitate research collaboration.
The Hungarian OENO codes were derived from the International Classification of Procedures in Medicine (ICPM) codes published by WHO in 1978 [11]. The development of the ICPM was practically stopped by the WHO in 1989 [12], so the Hungarian codes were no longer linked to it. Since then, several new codes have been added and the codes have been developed mainly to support the reimbursement of fee-for-service and diagnosis-related groups [13, 14]. As a result, the coding system is tailored to the medical practice and insurance system in Hungary and is not linked to other commonly used international standards.
Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT), maintained by SNOMED International, is the most comprehensive, multilingual clinical healthcare terminology globally, already mapped to other international standards [15]. In the OMOP system SNOMED CT is the standard terminology for procedures. The current study aims to map the Hungarian procedure codes to standard OMOP concepts and to present the main challenges of mapping and how they could be overcome.
Around 5,500 inpatient and 3,100 outpatient procedure codes have been introduced in Hungary [9, 10], but some are no longer in use, mostly because the procedures have become outdated. To optimize and join the efforts of the universities participating in the project, we focused on mapping those codes that were recently in use at the institutions. We excluded from mapping those procedure codes that were not used by either of the two universities between 2011 and 2021.
Multiple clinical specialties could use some of the codes. For validation purposes, these codes (approx. 7%) were allocated in two or three specialty groups for mapping. Then the results were checked to test the consistency of their mapping.
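The consistency check described above can be sketched as a simple comparison of the target concepts chosen by each group for the same source code. This is only an illustration under stated assumptions: the function name `check_consistency`, the tuple layout, and all code and concept identifiers are hypothetical placeholders, not the study's actual tooling or real OENO or SNOMED CT codes.

```python
from collections import defaultdict


def check_consistency(mappings):
    """Split multiply-mapped codes into consistent and conflicting ones.

    mappings: iterable of (source_code, specialty_group, target_concept).
    """
    targets_by_code = defaultdict(dict)
    for source_code, group, target in mappings:
        targets_by_code[source_code][group] = target
    # Only codes mapped by at least two groups can be validated.
    validated = {c: g for c, g in targets_by_code.items() if len(g) >= 2}
    consistent = sorted(c for c, g in validated.items() if len(set(g.values())) == 1)
    conflicting = sorted(c for c, g in validated.items() if len(set(g.values())) > 1)
    return consistent, conflicting


# Hypothetical placeholder data.
example_mappings = [
    ("CODE_A", "cardiology", "concept_1"),
    ("CODE_A", "radiology", "concept_1"),
    ("CODE_B", "cardiology", "concept_2"),
    ("CODE_B", "surgery", "concept_3"),
    ("CODE_C", "surgery", "concept_4"),  # mapped by one group only, not validated
]
consistent, conflicting = check_consistency(example_mappings)
print("consistent:", consistent)
print("conflicting:", conflicting)
```

Codes that end up in the conflicting list would then be candidates for manual review.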
To support the prioritization of the mapping exercise, the overall frequency of each code, its frequency in outpatient departments, and its frequency in inpatient departments were calculated and shared with those involved in the mapping.
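A frequency table of this kind could be computed along the following lines. This is a minimal sketch, assuming usage records are available as (procedure code, setting) pairs; the record layout, the function name `code_frequencies`, and the placeholder codes are assumptions made for illustration.

```python
from collections import Counter


def code_frequencies(records):
    """Return (code, overall, outpatient, inpatient) rows, most frequent first.

    records: iterable of (procedure_code, setting), where setting is
    "outpatient" or "inpatient".
    """
    overall, outpatient, inpatient = Counter(), Counter(), Counter()
    for code, setting in records:
        if setting == "outpatient":
            outpatient[code] += 1
        elif setting == "inpatient":
            inpatient[code] += 1
        else:
            raise ValueError(f"unknown setting: {setting!r}")
        overall[code] += 1
    # most_common() sorts by overall frequency, highest first.
    return [(code, n, outpatient[code], inpatient[code]) for code, n in overall.most_common()]


# Hypothetical placeholder records.
records = [
    ("P1", "outpatient"), ("P1", "inpatient"), ("P1", "outpatient"),
    ("P2", "inpatient"), ("P2", "inpatient"),
    ("P3", "outpatient"),
]
table = code_frequencies(records)
for row in table:
    print(row)
```

Sorting by overall frequency puts the codes whose mapping affects the most records at the top of the work list.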
The main target vocabulary was SNOMED CT [15] codes in the Procedure domain. The domains are the modalities of the code in the OMOP CDM. Since the OENO codes are used not only for medical interventions but also for diagnostic tests and medical devices, Observation, Measurement, and Device domains were also used for mapping. For some of the codes other standard vocabularies of OMOP CDM were found more suitable, such as Logical Observation Identifiers Names and Codes (LOINC) [16], RxNorm [17], and HemOnc [18]. Procedure descriptions in some cases include information on conditions, but these were not included in the target domains because we wanted to avoid having multiple, potentially contradictory sources of information on the diagnosis. The Hungarian diagnosis vocabulary (BNO) is used explicitly for coding diagnoses; it was therefore considered the primary source of information for diagnoses and mapped to SNOMED CT.
The mapping project was carried out by a large research team (including PhD students, resident physicians and one specialist doctor). We developed a mapping protocol based on an Australian mapping guideline [19] and Observational Health Data Sciences and Informatics (OHDSI) materials [20, 21], with the support of an international medical code expert. For outpatient codes, we used the outpatient code rulebook [9], which contains coding rules and, in most cases, code definitions. We used these as a guide to understanding the content of the code. For inpatient codes, only the coding rules were available; no precise definitions for the codes were accessible. We pilot-tested the mapping protocol on 50 random codes with three core team members and refined it afterward. The mapping protocol can be found in the appendix (see Additional file 1).
When mapping from the source to the target vocabularies, we primarily aimed to find a target concept equivalent to the source in meaning. In these cases, we noted that this was an equivalent match. In the instances when there was no equivalent target concept, we mapped to a wider target concept and marked it as a wider match.
During the matching we had to decide whether to use only pre-coordinated concepts or also post-coordinated expressions. Pre-coordinated concepts are single concepts that already exist in one of the vocabularies, whereas post-coordinated expressions combine several concepts through relationships, so that together they express a more specific meaning. SNOMED CT supports post-coordinated expressions, but in the OMOP CDM they are not usable for procedures: there is no possibility to establish a relationship between two procedures or between a procedure and an anatomic site. Therefore, we used only pre-coordinated concepts and mapped to a wider concept in the absence of an equivalent concept.
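The rule of preferring an equivalent pre-coordinated concept and otherwise falling back to a wider one can be illustrated with a toy hierarchy walk. This is a sketch under strong simplifying assumptions: the concept names are made up, each concept has a single parent (SNOMED CT itself allows multiple parents), and in the study the choice was made manually by the mappers rather than by code.

```python
def map_to_precoordinated(desired, available, parent_of):
    """Return (target, match_type) using only pre-coordinated concepts.

    desired:   the concept that would express the source code exactly
    available: set of pre-coordinated concepts usable as targets
    parent_of: dict mapping each concept to its single wider concept
    """
    current = desired
    seen = set()  # guards against cycles in a malformed hierarchy
    while current is not None and current not in seen:
        if current in available:
            match_type = "equivalent" if current == desired else "wider"
            return current, match_type
        seen.add(current)
        current = parent_of.get(current)  # move up to the wider concept
    return None, "unmapped"


# Made-up toy hierarchy for illustration only.
parent_of = {
    "biopsy of left kidney": "biopsy of kidney",
    "biopsy of kidney": "biopsy",
    "biopsy": "procedure",
}
available = {"biopsy of kidney", "biopsy", "procedure"}

wider_result = map_to_precoordinated("biopsy of left kidney", available, parent_of)
equivalent_result = map_to_precoordinated("biopsy of kidney", available, parent_of)
print(wider_result)
print(equivalent_result)
```

Recording the match type alongside the target keeps the loss of granularity visible to anyone who later analyses the mapped data.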