I am new to R, and I am very excited to have found this package for my data analysis. I saw a 2015 post by KM Tsui asking for assistance, titled "poppr beginner". I have read that post, and I am attempting to convert a data.frame of allele data to a genind object. I tried two different files. The first, "testing.csv", is an incomplete file for which I was able to put together the following script:
ALO <- read.csv("C:/Users/innap/Documents/R files/testing.csv", header = TRUE, sep = ",")
alos <- df2genind(ALO[, -c(1, 2)], ncode = 3, ind.names = ALO[[1]], pop = ALO[[2]], ploidy = 2)
poppr(alos)
| Idaho
| Oregon
| Total
     Pop  N MLG eMLG       SE     H G lambda   E.5   Hexp  Ia rbarD File
1  Idaho 31   1    1 0.00e+00 0.000 1    0.0   NaN 0.0000 NaN   NaN alos
2 Oregon 33   1    1 0.00e+00 0.000 1    0.0   NaN 0.0000 NaN   NaN alos
3  Total 64   2    2 1.73e-08 0.693 2    0.5 0.999 0.0387   1     1 alos
After about 8 hours of trying different commands, something worked, so that's good. However, I am not sure the values were read in correctly, because Ia and rbarD show NaN for each population (possibly because it is a very short initial file with only two populations?). I set ncode = 3 because my alleles are written as A/A in the file, and I don't exactly know what the -c(1, 2) in ALO[, -c(1, 2)] does. I found this format in a PDF titled "Reading Genetic Data Files Into R with adegenet and pegas".
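My current guess about -c(1, 2), which I would appreciate having confirmed, is that a negative index in base R drops columns rather than selecting them, so the first two columns (individual and population) are removed before the loci are passed on. A minimal sketch on a made-up data frame (the object `toy` and its column names are hypothetical, not from my file):

```r
# Hypothetical toy data standing in for the real CSV
toy <- data.frame(
  Ind  = c("i1", "i2"),
  Pop  = c("Idaho", "Oregon"),
  Loc1 = c("A/A", "A/B"),
  Loc2 = c("B/B", "A/A"),
  stringsAsFactors = FALSE
)

# A negative index drops columns: -c(1, 2) removes columns 1 and 2,
# leaving only the locus columns
loci_only <- toy[, -c(1, 2)]
names(loci_only)
```

If that reading is right, only Loc1 and Loc2 remain in `loci_only`.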
I also made another CSV file (test2.csv) modeled after the diploid data on this page:
http://grunwaldlab.github.io/Population_Genetics_in_R/Data_Preparation.html
I tried to run it with the same code and, of course, it did not work. I would really like to use the second format (test2.csv), as the website suggests, but I could not work out how to call df2genind so that my data can be used in that layout: when the file is read, the column holding the second allele of each locus is treated as a separate locus. Any ideas on a command to merge the two columns into one locus with two alleles? Is it worth including the metadata rows at the top, and how do I read them in without messing up the headers?
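To show what I mean by merging, here is a rough base R sketch on invented data. It assumes the two alleles of each locus sit in adjacent columns after the individual and population columns; the object names (`raw`, `merged`) and the locus labels are hypothetical, and I do not know whether this is the recommended approach:

```r
# Invented example: two loci, each spread over two allele columns
raw <- data.frame(
  Ind = c("i1", "i2"),
  Pop = c("Idaho", "Oregon"),
  L1a = c(1, 1), L1b = c(1, 2),
  L2a = c(2, 1), L2b = c(2, 1)
)

allele_cols <- raw[, -c(1, 2)]                   # drop Ind and Pop
first <- seq(1, ncol(allele_cols), by = 2)       # first column of each pair

# Paste each pair of columns into one "allele1/allele2" string per locus
merged_list <- lapply(first, function(i) {
  paste(allele_cols[[i]], allele_cols[[i + 1]], sep = "/")
})
names(merged_list) <- paste0("Locus", seq_along(first))
merged <- as.data.frame(merged_list, stringsAsFactors = FALSE)

merged
# Untested idea: pass the result on with the separator stated explicitly,
# e.g. df2genind(merged, sep = "/", ind.names = raw$Ind, pop = raw$Pop, ploidy = 2)
```

With one column per locus and an explicit separator, each genotype would at least be kept together as a single locus.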
Also, can you please recommend which format (testing.csv or test2.csv) would be the most user-friendly for running standard measures of diversity, Ia and rbarD? I am working with allozymes in a primarily selfing diploid plant. The columns I need to include are: Individual, Range, State, Population, and 26 loci (to be added later). If anyone can share an example file, that would be great! I hope I am at least on the right track. Thank you!!
Inna P. Smith