I am using data from ADNI’s servers to link genomic data for specific SNPs with diagnostic data. I used plink to pull out the genomic data for each SNP. However, when I try to link the diagnostic data with the genomic data, the subject identifiers don’t match up. I’ll go over every file and step I used to obtain this data.
I downloaded the PLINK-formatted ADNI Omni2.5M microarray SNP data.
I then ran the file through plink and exported the specific SNPs in .raw format.
I then organized this output into a .csv file and sorted it by the IID column.
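The conversion step above can be sketched roughly as follows. This is a minimal illustration, not the exact procedure used: it assumes the .raw file has the usual whitespace-delimited layout (FID, IID, PAT, MAT, SEX, PHENOTYPE, then one column per SNP), and the sample IDs and SNP column name are made up.

```python
import csv
import io


def raw_to_sorted_rows(raw_text):
    """Parse whitespace-delimited .raw text into a header and rows sorted by IID."""
    lines = [line.split() for line in raw_text.splitlines() if line.strip()]
    header, records = lines[0], lines[1:]
    rows = [dict(zip(header, record)) for record in records]
    # IIDs are compared as strings, so "10" sorts before "9".
    return header, sorted(rows, key=lambda row: row["IID"])


def rows_to_csv(header, rows):
    """Serialize the rows as CSV text, keeping the original column order."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=header, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up sample: the IIDs and the SNP column name are hypothetical.
SAMPLE_RAW = """FID IID PAT MAT SEX PHENOTYPE rs0000001_A
2 sample_B 0 0 1 -9 0
1 sample_A 0 0 2 -9 2
"""

header, rows = raw_to_sorted_rows(SAMPLE_RAW)
csv_text = rows_to_csv(header, rows)
print(csv_text)
```

Keeping the IID as a string avoids silently dropping leading zeros or underscores, which matters when the IDs are later compared against another file.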
The ADNI_Gene_Expression_Profile_DICT file, from ADNI_Gene_Expression_Profile.zip on the ADNI website, gives the following format for subject identification:
Subject ID including site; first three numbers indicate site (ZZZ), last four numbers indicate unique subject ID (XXXX, the RID); ZZZ_S_XXXX
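To make the ZZZ_S_XXXX description concrete, here is a small sketch that splits such an ID into its site and RID parts. The function name and the example ID are hypothetical, and treating the RID as an integer (dropping leading zeros) is an assumption that may not match how every ADNI table stores it.

```python
import re

# Three digits for the site, a literal "_S_", then four digits for the RID.
SUBJECT_ID_PATTERN = re.compile(r"^(\d{3})_S_(\d{4})$")


def split_subject_id(subject_id):
    """Return (site, rid) for a ZZZ_S_XXXX ID, or None if it does not fit the format."""
    match = SUBJECT_ID_PATTERN.match(subject_id.strip())
    if match is None:
        return None
    site, rid = match.groups()
    return site, int(rid)


# Hypothetical ID used only for illustration.
print(split_subject_id("123_S_0045"))
print(split_subject_id("not_an_id"))
```

Running this over the IID column would show quickly whether the genomic IDs follow this format at all; any value that returns None uses some other scheme.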
The ZZZ_S_XXXX format is consistent with how subject IDs appear in the ADNI diagnosis data and with how they are described in the data dictionary. However, when I compared the IDs from the two groups (the genomic output and the diagnosis data):
None of the IDs in one group matched any of the IDs in the other
Also, the RID on its own was a non-specific identifier in the ADNI diagnosis data
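A comparison like the one described above could be checked with a short script along these lines. This is only a sketch: the function names and the example IDs are hypothetical, and it assumes both ID columns have already been read into plain lists of strings.

```python
from collections import Counter


def compare_ids(genomic_ids, diagnostic_ids):
    """Report which IDs are shared and which appear in only one of the two lists."""
    genomic = {value.strip() for value in genomic_ids}
    diagnostic = {value.strip() for value in diagnostic_ids}
    return {
        "shared": sorted(genomic & diagnostic),
        "genomic_only": sorted(genomic - diagnostic),
        "diagnostic_only": sorted(diagnostic - genomic),
    }


def repeated_values(values):
    """Return each value that occurs more than once, with its count."""
    counts = Counter(values)
    return {value: count for value, count in counts.items() if count > 1}


# Hypothetical IDs used only for illustration.
genomic_iids = ["sample_A", "sample_B"]
diagnostic_ids = ["123_S_0045", "123_S_0046", "123_S_0045"]

print(compare_ids(genomic_iids, diagnostic_ids))
print(repeated_values(diagnostic_ids))
```

An empty "shared" list confirms that the two files use different ID schemes rather than differing only in whitespace, and a non-empty result from repeated_values shows which identifiers occur on more than one row.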