gl2geno for pcadapt - name of loci?

198 views
Skip to first unread message

Gabriella Scatà

unread,
Dec 10, 2023, 10:11:46 PM12/10/23
to dartR
Hi,
I have converted my genlight dataset into .lfmm for use with the pcadapt package (outliers detection). Everything seems to be ok, but when I run pcadapt and I get the number of outliers, I only get a list of indexes, not the names of the loci.

Is there a way to get the actual outlier loci names? 
Or at least to know if in the .lfmm file  they are listed in the same order as in the genlight file?

This issue seems to have been reported before --> 
So I wonder if it was ever solved?

Below my code:
I converted my genlight file to geno (.lfmm) with gl2geno (dartR2.7.2) and then reading the file with read.pcadapt

<
my.dataset_PCAdapt_file_gl2geno = read.pcadapt(
    "C:/Users/scatag/my.dataset_geno_dev.lfmm",
    type = c("pcadapt"),
    type.out = c("matrix"),
    allele.sep = c("/"))

my.dataset_PCAdapt_file_gl2geno_PCADAPTres <- pcadapt(
    input =  my.dataset_PCAdapt_file_gl2geno ,
    K = 50) 

qval <- qvalue(
     my.dataset_PCAdapt_file_gl2geno_PCADAPTres$pvalues)$qvalues
  
alpha <- 0.1
 
  outliers_my.dataset_PCAdapt_file_gl2geno <- which(qval < alpha)
  length(outliers_my.dataset_PCAdapt_file_gl2geno) # 96

outliers_my.dataset_PCAdapt_file_gl2geno  #  4  431  759  939 1035 1401 ....
>

It gives me only the index of the outliers, not the name...

I would love some feedback and advice on how to proceed.

Also, anyone would recommend other outlier detection methods? Outflank detects 0 outliers and pcadapt 96 outliers in the same dataset, so I am unsure of which would be more correct. It would be interesting to try a different method and see what the results of the 3rd method are closest to.

Thank you for your help!
Best
Gabriella

Jose Luis Mijangos

unread,
Dec 10, 2023, 11:37:13 PM12/10/23
to dartR
Hi Gabriella,

If you use the "bed" format from plink, as shown in the code below, the order of the loci is the same as in the results of pcadapt.  

library(pcadapt)
library(dartR)
gl2plink(testset.gl,
         plink_path = getwd(),
         bed_file = TRUE,
         outpath = getwd(),
         outfile = "test")
filename <- read.pcadapt("test.bed", type = "bed")
x <- pcadapt(filename, K = 3)
padj <- p.adjust(x$pvalues,method="bonferroni")
alpha <- 0.1
outliers <- locNames(testset.gl)[which(padj < alpha)]

You could try also redundancy analysis: https://doi.org/10.1111/mec.14584  

Cheers,
Luis 

Gabriella Scatà

unread,
Dec 11, 2023, 12:08:57 AM12/11/23
to dartR
Hi Luis,
thank you for your quick reply.

I think I tried in the past to use gl2plink, but unfortunately it calls plink externally from R through the dartR function, and then the program is blocked by my workplace firewall/IT protection system. I will have to look for an alternative solution there...
Unfortunately, I haven't been able to pass that obstacle.

So, the lfmm file obtained with gl2geno does not have loci in the same order as pcadapt then?

Thank you for suggesting RDA, I was actually just looking into that as a possibility. Any specific package you would recommend for RDA?
I found a paper using RDA through the vegan package (https://onlinelibrary.wiley.com/doi/full/10.1111/1755-0998.12906)

Thanks again
Best,
Gabriella

Jose Luis Mijangos

unread,
Dec 11, 2023, 12:29:10 AM12/11/23
to dartR
Hi Gabriella,

The lfmm file obtained with gl2geno has loci in the same order as pcadapt. However, note that when converting a genlight object to lfmm, loci with all missing data are removed. So, use gl.filter.allna() before converting to lfmm. 

Yes, vegan works for RDA. 

Cheers,
Luis 

Gabriella Scatà

unread,
Dec 11, 2023, 12:29:11 AM12/11/23
to dartR
Plus, when I try at the moment I even get stuck before:

gl2plink (with the code you gave me before) gives me the following error:
"Error in names(x) <- value :
  'names' attribute [2] must be the same length as the vector [1]"

What does it mean?

Thanks again!
Gabriella

Gabriella Scatà

unread,
Dec 11, 2023, 12:57:07 AM12/11/23
to dartR
Hi Luis,
just to let you know I just remembered there was already a post addressing this issue "Error in names(x) = value " when using gl2plink --> https://groups.google.com/g/dartr/c/yMzplFdRRKQ/m/zkecNHCjAgAJ
So that problem is fixed..

Now I just need to fix the firewall issue...

But, if I use lfmm file then I don't have to worry about conversion to bed file. So, please tell me if I understand correctly: in order to obtain the same list of loci & in the same order between the genlight & the lfmm file for pcadapt, I just need to do the following:

1.  my_genlight_noNA = gl.filter.allna(my_genlight)
2. gl2geno(my_genlight_noNA,
          outfile = "my_genlight_noNA_lfmm",
          outpath = getwd(),
          verbose =5)

Is this the correct code?

Thanks again!
Best,
Gabriella

Gabriella Scatà

unread,
Jan 21, 2024, 11:26:59 PM1/21/24
to dartR
Hi Luis,
just wondering whether you have a link to a tutorial for conducting RDA with SNPs data?
I only found tutorials in which SNPs data is integrated with environmental variables, but I only have SNPs data...would RDA still be useful to detect outlier loci before continuing with downstream population structure analyses (i.e. DAPC, admixture)?
I am also getting a bit confused with all the various types of RDA analyses/approaches and which one would be more appropriate for detecting outliers with SNPs-only data.
I wold really appreciate some advice or additional resources if you have any.

Thank you!
Best,
Gabriella

Jose Luis Mijangos

unread,
Feb 2, 2024, 12:00:18 AM2/2/24
to dartR
Hi Gabriella,

- Australia's Bureau of meteorology (BOM) has tons of environmental data that is freely available: Australia's official weather forecasts & weather radar - Bureau of Meteorology (bom.gov.au)
- For other methods check this paper: https://doi.org/10.1016/j.gene.2021.146165 
- You could also try outflank (see function gl.outflank) or bayescan (see function gl2bayescan).
- If you have a reference genome and a GFF file, you could filter out coding regions (after blasting your SNPs) 


Cheers,
Luis 

Paige M.

unread,
Jun 27, 2025, 9:20:34 PMJun 27
to dartR
Hello, I can't find a gl.outflank or other type of outflank option in the new dartRverse. Is this type of analysis no longer an option?
Thank you.

Jose Luis Mijangos

unread,
Jul 2, 2025, 2:12:51 AMJul 2
to dartR
Hi Paige,

The function gl.outflank is in dartR.popgen, you can just run the code below:

library(dartRverse)
t1 <- platypus.gl
r1 <- gl.outflank(t1)

Cheers,
Luis 

Paige M.

unread,
Jul 2, 2025, 12:53:34 PMJul 2
to dartR
Hi Luis,
Thank you. So simple! For some reason dartR.popgen was not installed already with my dartRverse!!
Best,
Reply all
Reply to author
Forward
0 new messages