Runs of Homozygosity - found 0 ROH

Larissa Arantes

Sep 28, 2022, 5:41:33 AM
to plink2-users
Hi,
I'm running a Runs of Homozygosity (ROH) analysis using the command line below. I get no error message, but no ROH are detected either.

"plink --bfile $Plinkinput --homozyg --allow-extra-chr"

PLINK v1.90b6.21 64-bit (19 Oct 2020)          www.cog-genomics.org/plink/1.9/
(C) 2005-2020 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to plink.log.
Options in effect:
  --allow-extra-chr
  --bfile $Plinkinput
  --homozyg

95333 MB RAM detected; reserving 47666 MB for main workspace.
4500623 variants loaded from .bim file.
1 person (0 males, 0 females, 1 ambiguous) loaded from .fam.
Ambiguous sex ID written to plink.nosex .
Using 1 thread (no multithreaded calculations invoked).
Before main variant filters, 1 founder and 0 nonfounders present.
Calculating allele frequencies... done.
4500623 variants and 1 person pass filters and QC.
Note: No phenotypes present.
--homozyg: Scan complete, found 0 ROH.
Results saved to plink.hom + plink.hom.indiv + plink.hom.summary .

Can you recognize any errors?
Do you have any recommendations to test?

Thank you very much.
Best,
LSA

Christopher Chang

Sep 28, 2022, 11:27:34 AM
to plink2-users
This is usually due to importing a VCF that has no 0/0 genotypes.  See https://groups.google.com/g/plink2-users/c/weLDmbLp8LQ/m/ETkvQKuRAQAJ .
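
One quick way to check whether a VCF is affected is to tally the genotype classes before importing it. The following is only a minimal sketch: `count_genotypes` and the file name in the usage comment are hypothetical, it reads the first sample column by default, and it assumes a standard tab-delimited VCF with a GT entry in the FORMAT column.

```python
import gzip
from collections import Counter


def count_genotypes(vcf_path, sample_index=0):
    """Tally hom-ref, het, hom-alt and missing calls for one sample in a VCF."""
    counts = Counter()
    # Plain-text and gzip/bgzip-compressed VCFs are both supported.
    opener = gzip.open if vcf_path.endswith(".gz") else open
    with opener(vcf_path, "rt") as handle:
        for line in handle:
            if line.startswith("#"):
                continue  # skip meta-information and header lines
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 10 + sample_index:
                continue  # no genotype column for the requested sample
            format_keys = fields[8].split(":")
            if "GT" not in format_keys:
                continue
            gt = fields[9 + sample_index].split(":")[format_keys.index("GT")]
            alleles = gt.replace("|", "/").split("/")
            if "." in alleles:
                counts["missing"] += 1
            elif len(set(alleles)) > 1:
                counts["het"] += 1
            elif alleles[0] == "0":
                counts["hom_ref"] += 1
            else:
                counts["hom_alt"] += 1
    return counts


# Hypothetical usage:
# print(count_genotypes("sample.vcf.gz"))
```

If `hom_ref` comes back as zero, the file contains no 0/0 genotypes, which is the situation described above.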

Larissa Arantes

Nov 8, 2022, 12:20:35 PM
to plink2-users
Dear,
Thank you for your feedback.
I want to identify ROH for a single genome. My VCF file was obtained by mapping the HiFi reads to the genome assembly generated using the same data of the same individual. Therefore, I have no 0/0 genotypes.
Following your recommendation, I generated an extended gVCF with all positions (3 Gb) and now I have plenty of 0/0 genotypes.
I was able to run the ROH analysis in PLINK using the gVCF, but the sum of the ROH lengths is close to the chromosome size, meaning the entire chromosome is in ROH, which is not true.
I tested different parameter combinations and the number of ROH changes, but the total length is always close to the chromosome size.
Please find the gVCF for one chromosome, the .log and .hom files here:
https://drive.google.com/drive/folders/1epjWMKlhj4VDwmPZYANeaAQ_Kg4-wMKo?usp=sharing
Can you identify any problem in my file that might lead to ROH in the entire genome?
I want to filter the gVCF (by minimum and maximum coverage, mapping quality, indels, and biallelic positions), and consequently I will lose some positions. How does PLINK deal with missing positions? Does it treat them as missing or as invariant positions?
Thank you very much for your support.
Best,
LSA




Larissa Arantes

Jan 27, 2025, 6:56:12 AM
to plink2-users
Hi,

I'm still trying to run PLINK --homozyg for a single individual and encountering an issue where the entire chromosome is classified as ROH.
Could you help me understand the potential reasons for this problem?
I’m wondering if this could be related to PLINK’s ROH method requiring allele frequency information for accurate ROH classification.
Thank you in advance for your assistance!

Chris Chang

Jan 27, 2025, 8:25:55 AM
to Larissa Arantes, plink2-users
Plink doesn’t directly use allele frequency in its ROH algorithm.  However, you are responsible for providing a set of variants that are actually variant — the default settings were chosen for genotyping array data where e.g. minor allele frequency was over 5% in the population.


Larissa Arantes

Jan 29, 2025, 1:37:42 PM
to plink2-users
Thank you for your reply! 
I'm providing a gVCF containing invariant (0/0) and variant (0/1) sites for a single individual, meaning that my variant sites are variant within the individual.
Can PLINK --homozyg deal with this kind of data?
I tested several different parameters, and the number of classified ROH changes, but for all tests, the sum of the ROH length is almost equal to the genome size. So, the results don't seem to be related to the parameter choice.
What could explain this issue? Is PLINK suitable for a single-individual ROH analysis?

Chris Chang

Jan 29, 2025, 1:47:26 PM
to Larissa Arantes, plink2-users
Again, you need to filter down to variants that are present at nonnegligible frequency in a broader population.  Do you have access to any sort of allele-frequency file for the species you're working with?  (For humans, the 1000 Genomes dataset can be used for this purpose, and more recent projects like gnomAD provide this information at larger scale.)
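
Where such a frequency file exists, the filtering step might look roughly like the sketch below. It assumes a whitespace-delimited PLINK 1.9 `.frq` report (as written by `--freq`, with `SNP` and `MAF` columns) from a reference panel whose variant IDs match those in the individual's fileset. The function names, the file names in the usage comment, and the 0.05 threshold (borrowed from the 5% figure mentioned earlier in the thread) are illustrative assumptions.

```python
def common_variant_ids(frq_path, min_maf=0.05):
    """Return IDs of variants whose MAF in a PLINK .frq report is >= min_maf."""
    ids = []
    with open(frq_path) as handle:
        header = handle.readline().split()
        snp_col = header.index("SNP")
        maf_col = header.index("MAF")
        for line in handle:
            fields = line.split()
            if len(fields) <= max(snp_col, maf_col):
                continue  # skip blank or truncated lines
            try:
                maf = float(fields[maf_col])
            except ValueError:
                continue  # e.g. "NA" when no genotypes were observed
            if maf >= min_maf:
                ids.append(fields[snp_col])
    return ids


def write_extract_list(ids, out_path):
    """Write one variant ID per line, the layout expected by PLINK's --extract."""
    with open(out_path, "w") as out:
        for variant_id in ids:
            out.write(variant_id + "\n")


# Hypothetical usage:
# write_extract_list(common_variant_ids("reference.frq"), "common_ids.txt")
```

The resulting list could then be passed to PLINK with `--extract common_ids.txt` alongside `--homozyg`, so that only variants common in the broader population enter the ROH scan.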

Larissa Arantes

Feb 3, 2025, 8:08:40 AM
to plink2-users
No, I don't have allele frequencies. I'm working with a non-model species.
Could you please explain why PLINK's ROH algorithm needs the allele frequency?

Chris Chang

Feb 3, 2025, 10:47:59 AM
to Larissa Arantes, plink2-users
Plink --homozyg was developed when genotyping array data was the norm.  With genotyping array data, where the variants all have relatively high minor allele frequencies, if you see 50 homozygous genotypes in a row, that actually tells you something about a possible ROH.

In contrast, if you have a file with genotypes for all ~3 billion human genome positions, 50 homozygous genotypes in a row is totally ordinary and tells you practically nothing about ROH.
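
A rough back-of-the-envelope calculation illustrates the contrast. The sketch below treats sites as independent (ignoring linkage disequilibrium) and uses two purely illustrative per-site heterozygosity values: about 0.30 for common array SNPs and about 0.001 for an arbitrary genome position. Both numbers are assumptions chosen for the example, not values from PLINK or from this dataset.

```python
def prob_all_homozygous(het_rate, n_sites=50):
    """Chance that n_sites independent sites are all homozygous in a non-ROH region."""
    return (1.0 - het_rate) ** n_sites


# Assumed per-site heterozygosity outside any ROH (illustrative values only).
array_like = prob_all_homozygous(0.30)   # common SNPs, as on a genotyping array
all_sites = prob_all_homozygous(0.001)   # every position in the genome

print(f"array-like SNPs: {array_like:.2e}")
print(f"all positions:   {all_sites:.3f}")
```

Under these assumptions, 50 homozygous calls in a row are vanishingly unlikely by chance among common SNPs (on the order of 1e-8), but occur about 95% of the time when every position is included, so the same run length carries almost no evidence of an ROH.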
