
I did a liftover of the region from GRCm38 to GRCm39, and it looks like the region has gotten smaller:
Build  chr    start     end       width (bp)
38     chr14  25875021  26294840  419819
39     chr14  25989500  26014699  25199
So there may have been some issue with alignment. A region on Chr 18 also came up in the liftover file, so maybe there is some sequence similarity that made it hard to place the reads.
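For anyone who wants to double-check the numbers, here is a small Python sketch of the width arithmetic. The coordinates are copied from the table above; the `width` helper and the variable names are just for illustration.

```python
# Liftover coordinates for the chr14 region, in bp, copied from the table above.
regions = {
    "GRCm38": (25875021, 26294840),
    "GRCm39": (25989500, 26014699),
}

def width(start, end):
    """Width computed as end - start, matching the width column in the table."""
    return end - start

widths = {build: width(*coords) for build, coords in regions.items()}

# Fraction of the GRCm38 span that remains after liftover to GRCm39.
ratio = widths["GRCm39"] / widths["GRCm38"]

for build, w in widths.items():
    print(f"{build}: {w} bp")
print(f"GRCm39 span is {ratio:.1%} of the GRCm38 span")
```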
Dan
From: rqtl2...@googlegroups.com <rqtl2...@googlegroups.com>
On Behalf Of Karl Broman
Sent: Saturday, April 1, 2023 2:20 PM
To: R/qtl2 discussion <rqtl2...@googlegroups.com>
Subject: [SOCIAL NETWORK] [Rqtl2-disc] Re: CC Variants
Thanks for pointing this out. I'm not sure of the cause, but this gap does appear to be present in the Sanger SNP data.
The cc_variants.sqlite file is built from the Sanger SNP, indel, and structural variant VCF files.
The script that does the work is distributed with R/qtl2:
The SNP files for build 38 are at https://ftp.ebi.ac.uk/pub/databases/mousegenomes/REL-1505-SNPs_Indels/
I was grabbing copies that are at JAX, in particular this one: ftp://ftp.jax.org/SNPtools/variants/mgp.v5.merged.snps_all.dbSNP142.vcf.gz
Poking through the copy of the file that I have, it does have that big gap that you identified. I'm not sure whether the problem is in my copy or in the original Sanger file, or what the cause might be. I'm downloading fresh copies, but it's going to take a couple of hours.
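One way to look for a gap like this in a VCF is to stream the records and track the widest stretch with no variants. This is only a rough Python sketch, not the script used to build the database. It assumes records are sorted by position within each chromosome and that chromosomes are named without a "chr" prefix (e.g. "14"); the function name `largest_gap` is made up for this example.

```python
import gzip

def largest_gap(vcf_lines, chrom, start, end):
    """Return (gap_start, gap_end) for the widest stretch with no record
    on `chrom` between `start` and `end` (bp), or None if no records fall
    in that window. Assumes positions are sorted within the chromosome."""
    prev = start
    best = None
    for line in vcf_lines:
        if line.startswith("#"):          # skip meta and header lines
            continue
        fields = line.split("\t", 2)      # only CHROM and POS are needed
        if fields[0] != chrom:
            continue
        pos = int(fields[1])
        if pos < start or pos > end:
            continue
        if best is None or pos - prev > best[1] - best[0]:
            best = (prev, pos)
        prev = pos
    if best is None:
        return None
    # Also consider the stretch from the last record to the window end.
    if end - prev > best[1] - best[0]:
        best = (prev, end)
    return best

# Hypothetical usage on a local copy (streams the whole file, so it is slow):
# with gzip.open("mgp.v5.merged.snps_all.dbSNP142.vcf.gz", "rt") as fh:
#     print(largest_gap(fh, "14", 25_000_000, 27_000_000))
```

If the file has a tabix index, pulling just the region with tabix would be much faster than scanning the whole thing.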
We might look at the build 39 files to see if there are similar problems.
karl
On Saturday, April 1, 2023 at 12:40:00 PM UTC-5 jessica.ma...@gmail.com wrote:
Hey Karl,
When plotting the SNP associations for one of our phenotypes, we noticed a large portion of Chr 14 with no SNPs.
I downloaded a fresh version of cc_variants.sqlite, and when I query variants directly from the database, I see that there are no SNPs for the region 25.87502-26.29484 Mbp. Is this normal? Is it just a region that has very little variation among the founders? Pulling up the MGI SNP browser, I do see 1080 SNPs for the DO/CC founders within the 419,821 bp region, although the coverage is pretty low in some areas.
This just looks odd in our figures, and I haven't come across it before, so I am trying to determine whether I have a problem or, if not, a good explanation for it. Also, could this be what we are mapping to, some artifact?
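A direct query of this kind can be sketched in Python with the standard `sqlite3` module. This is a minimal sketch, not the exact query I ran. It assumes the database has a table named `variants` with `chr` and `pos` columns and that `pos` is stored in basepairs; the toy in-memory table and its positions are made up so the example runs without the real file.

```python
import sqlite3

def variant_positions(con, chrom, start_bp, end_bp):
    """Return sorted variant positions (bp) on `chrom` within [start_bp, end_bp].

    Assumes a table named `variants` with `chr` and `pos` columns,
    with `pos` stored in basepairs.
    """
    cur = con.execute(
        "SELECT pos FROM variants WHERE chr = ? AND pos BETWEEN ? AND ? ORDER BY pos",
        (chrom, start_bp, end_bp),
    )
    return [row[0] for row in cur]

# Toy in-memory database standing in for the real file (made-up positions).
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE variants (snp_id TEXT, chr TEXT, pos INTEGER)")
demo.executemany(
    "INSERT INTO variants VALUES (?, ?, ?)",
    [("a", "14", 25_870_000), ("b", "14", 26_300_000), ("c", "18", 26_000_000)],
)

# Inside the reported gap versus a slightly wider window around it.
inside = variant_positions(demo, "14", 25_875_021, 26_294_840)
flanking = variant_positions(demo, "14", 25_800_000, 26_400_000)
print(len(inside), flanking)
```

Against the real file, the connection would be `sqlite3.connect("cc_variants.sqlite")` instead of the in-memory database.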
Thanks,
Jessica