Hi,
I have been getting acquainted with the ADNI data files and have found answers to most of my questions in the documentation and in this group. However, I want to make sure I am using the CSF data correctly. Before getting access to the data, I asked how many participants are cognitively normal and have CSF amyloid beta data available at baseline. I was told that the number is 511, but I am unable to reproduce this figure, and I am wondering whether something I am doing is causing me to lose participants.
Here is what I have tried:
1. Using the ADNIMERGE package and its adnimerge data set in R as follows:
library(ADNIMERGE)
library(dplyr)
dd <- adnimerge
abeta <- dd[!is.na(dd$ABETA), ] # keep rows with a non-missing CSF Abeta value
cnbl <- abeta[which(abeta$DX == 'CN' & abeta$VISCODE == 'bl'), ] # cognitively normal at the baseline visit
And I am left with a sample of 369 observations.
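To see at which filter participants drop out, a stepwise count of unique RIDs may help. The following is only a minimal sketch on made-up data (not ADNI values); the toy data frame and the helper name count_steps are hypothetical, and the column names simply mirror the ones used above.

```r
# Toy stand-in for adnimerge; all values are invented, not ADNI data.
toy <- data.frame(
  RID     = c(1, 1, 2, 3, 4, 4),
  VISCODE = c("bl", "m12", "bl", "bl", "bl", "m06"),
  DX      = c("CN", "CN", NA, "CN", "MCI", "MCI"),
  ABETA   = c("1200", NA, "900", NA, ">1700", "1500"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: unique participants remaining after each successive filter.
count_steps <- function(d) {
  bl    <- d[which(d$VISCODE == "bl"), ]        # baseline rows only
  bl_ab <- bl[!is.na(bl$ABETA), ]               # ... with a non-missing Abeta value
  bl_cn <- bl_ab[which(bl_ab$DX == "CN"), ]     # ... and a CN diagnosis on that row
  c(all               = length(unique(d$RID)),
    baseline          = length(unique(bl$RID)),
    baseline_abeta    = length(unique(bl_ab$RID)),
    baseline_abeta_cn = length(unique(bl_cn$RID)))
}

steps <- count_steps(toy)
print(steps)
```

In the toy data, one participant is lost because ABETA is missing at baseline and another because DX is missing on the baseline row, which shows how a count like this separates the two causes.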
2. Using the UPENNBIOMK_MASTER_FINAL_08Nov2023.csv file and following the directions in a recent thread (https://groups.google.com/g/adni-data/c/58y848JRNQQ):
csf <- read.csv("UPENNBIOMK_MASTER_FINAL_08Nov2023.csv") #n=7120; multiple rows per subject due to different batches.
csfElecsys <- csf[which(csf$VISCODE == 'bl' & csf$BATCH %in% c('UPENNBIOMK9', 'UPENNBIOMK10', 'UPENNBIOMK12', 'UPENNBIOMK13')), ] #n=954
csfElecsys <- csfElecsys %>%
  distinct(EXAMDATE, .keep_all = TRUE) #n=620; keeps first row of multiple baseline batches
ab_dd_upenn <- merge(csfElecsys, dd, by=c("RID", "VISCODE")) #n=620, merge because CSF file does not have diagnosis
ab_cn_upenn <- ab_dd_upenn[which(ab_dd_upenn$DX=='CN'), ] #n=224
And I get 224 observations.
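The deduplication step above keys on EXAMDATE alone, so it may be worth checking whether participants who happen to share an exam date are being collapsed into a single row. Below is a minimal sketch on made-up data (not ADNI values). It uses base R's duplicated(), which keeps the first row per key and so stands in for distinct(..., .keep_all = TRUE); the toy data frame is hypothetical.

```r
# Toy stand-in for the filtered CSF table; all values are invented.
toy_csf <- data.frame(
  RID      = c(10, 10, 11, 12),
  EXAMDATE = c("2006-01-10", "2006-01-10", "2006-01-10", "2006-02-03"),
  BATCH    = c("UPENNBIOMK9", "UPENNBIOMK10", "UPENNBIOMK9", "UPENNBIOMK9"),
  stringsAsFactors = FALSE
)

# Keep the first row per exam date (what keying on EXAMDATE does).
by_date <- toy_csf[!duplicated(toy_csf$EXAMDATE), ]

# Keep the first row per participant instead.
by_rid <- toy_csf[!duplicated(toy_csf$RID), ]

n_by_date <- nrow(by_date)
n_by_rid  <- nrow(by_rid)
print(c(by_date = n_by_date, by_rid = n_by_rid))

# Participants present in the data but missing after deduplicating by date.
lost <- setdiff(unique(toy_csf$RID), by_date$RID)
print(lost)
```

In this toy case, RID 11 shares an exam date with RID 10 and disappears when rows are deduplicated by date, while deduplicating by RID keeps all three participants. Whether the same thing happens in the real file would need to be checked against the actual data.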
I do not see how to arrive at over 500 individuals with CSF Abeta data and normal cognition at baseline. Is merging with ADNIMERGE to obtain the diagnosis a mistake?
Thanks in advance!
Toni