CC Variants


jessica.ma...@gmail.com

Apr 1, 2023, 1:40:00 PM
to R/qtl2 discussion
Hey Karl,
When plotting the SNP associations for one of our phenotypes, we noticed a large portion of Chr 14 with no SNPs.
[Attached image: SNPAssoc Chr 14.jpeg]
I downloaded a fresh version of CC_variants.sqlite and when I directly query variants from the database I see that there are no snps for the region 25.87502-26.29484 Mbp. Is this normal? Is it just a region that has very little variation among the founders? Pulling up MGI SNP browser, I do see 1080 SNPs for the DO/CC founders within the 419,821 bp region although the coverage is pretty low in some areas.
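
A direct query of this kind can be sketched in Python with the standard sqlite3 module. This is only an illustration under assumptions: the table name `variants` and the columns `chr` and `pos` (positions in bp) are assumed to match the layout of the database file, and the rows below are made-up toy data in an in-memory database, not real SNPs. The window boundaries are the Mbp values quoted above converted to bp.

```python
import sqlite3

def count_variants(conn, chrom, start_bp, end_bp,
                   table="variants", chr_field="chr", pos_field="pos"):
    """Count rows on one chromosome with position in [start_bp, end_bp]."""
    # Table and column names cannot be bound as SQL parameters, so they are
    # interpolated into the statement; pass only trusted names here.
    sql = (f"SELECT COUNT(*) FROM {table} "
           f"WHERE {chr_field} = ? AND {pos_field} BETWEEN ? AND ?")
    return conn.execute(sql, (chrom, start_bp, end_bp)).fetchone()[0]

# Toy in-memory stand-in for the real file (hypothetical schema and rows).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE variants (snp_id TEXT, chr TEXT, pos INTEGER)")
conn.executemany(
    "INSERT INTO variants VALUES (?, ?, ?)",
    [("toy1", "14", 25_800_000),   # left of the window
     ("toy2", "14", 25_875_019),   # just left of the window
     ("toy3", "14", 26_294_841),   # just right of the window
     ("toy4", "18", 26_000_000)],  # same coordinates, different chromosome
)

# 25.87502-26.29484 Mbp expressed in bp
gap_count = count_variants(conn, "14", 25_875_020, 26_294_840)
flank_count = count_variants(conn, "14", 25_000_000, 27_000_000)
print(gap_count, flank_count)
```

To run it against a real file, replace the in-memory connection with `sqlite3.connect(path_to_file)` and adjust the table and field names if they differ.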

This just looks odd in our figures and I haven't come across this before so I am trying to determine whether I have a problem or have a good explanation for it if not. Also, could this be what we are mapping to, some artifact?

Thanks,
Jessica

Karl Broman

Apr 1, 2023, 2:19:44 PM
to R/qtl2 discussion
Thanks for pointing this out. I'm not sure of the cause, but this gap does appear to be present in the Sanger SNP data.

The cc_variants.sqlite file is built from the Sanger SNPs, indels, and structural variants VCF files.
The script that does the work is distributed with R/qtl2:

I was grabbing copies that are at JAX, in particular this one: ftp://ftp.jax.org/SNPtools/variants/mgp.v5.merged.snps_all.dbSNP142.vcf.gz

Poking through the copy of the file that I have, it does have that big gap that you identified. I'm not sure whether the problem is in my copy or in the original Sanger file, or what the cause might be. I'm downloading fresh copies, but it's going to take a couple of hours.

We might look at the build 39 files to see if there are similar problems.

karl

Karl Broman

Apr 1, 2023, 2:34:56 PM
to R/qtl2 discussion
The SNP VCF file I have has an MD5 hash that matches the value reported for the Sanger file.
So I’m working with the same file, and it does seem to have that gap with no variants.

    3ceffa10ee653ef54dc0f3524b7d9a57  mgp.v5.merged.indels.dbSNP142.normed.vcf.gz
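
A check like this can be reproduced with a few lines of Python using the standard hashlib module. This is a minimal sketch, not the command actually used in this thread; the function names `md5_of_file` and `matches_published` are hypothetical, and the file is read in chunks so that a multi-gigabyte VCF does not have to fit in memory.

```python
import hashlib

def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns empty bytes (EOF)
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_published(path, published_hex):
    """Compare a local file's MD5 with a published value, ignoring case."""
    return md5_of_file(path) == published_hex.strip().lower()
```

For example, `matches_published(local_path, "3ceffa10ee653ef54dc0f3524b7d9a57")` returns True only if the local copy is byte-for-byte identical to the file that hash was computed from.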

karl

Dan Gatti

Apr 3, 2023, 8:00:06 AM
to rqtl2...@googlegroups.com

I did a liftover of the region from GRCm38 to GRCm39 and it looks like the region has gotten smaller:

 

Build  chr    start     end       width
38     chr14  25875021  26294840  419819
39     chr14  25989500  26014699   25199

 

So there may have been some issue with alignment. There is a region on Chr 18 that also came up in the liftover file, so maybe there is some sequence similarity that made it hard to place the reads.
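
The widths in the table are simply end minus start, and the size of the shrinkage can be checked with a short Python calculation. The coordinates are the ones from the table; the variable names are only for illustration.

```python
# (start, end) of the chr14 region in each build, taken from the table above
regions = {
    "GRCm38": (25_875_021, 26_294_840),
    "GRCm39": (25_989_500, 26_014_699),
}

# Width as reported in the table: end - start
widths = {build: end - start for build, (start, end) in regions.items()}

# Fraction of the original GRCm38 span that remains after liftover
ratio = widths["GRCm39"] / widths["GRCm38"]

print(widths)
print(f"{ratio:.1%}")
```

So the lifted-over span in GRCm39 is roughly 6% of the length it had in GRCm38.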

 

Dan

 


Karl Broman

Apr 20, 2023, 4:14:02 PM
to R/qtl2 discussion
As Dan pointed out, the big changes in the “new” (as of June 2020) mouse genome build GRCm39 are on chr 10 and 14, so if you have a QTL on those chromosomes, it would be particularly important to switch to the new build.

I think we finally have all of the necessary materials for QTL analysis with GRCm39, and this morning I wrote a document pointing to them. The key things here:

- If your genotypes are from MegaMUGA/GigaMUGA, you can use the mmconvert package to convert your cross2 object to the new build.

- There’s a new SQLite database for CC founder variants, using the GRCm39 build. It includes Ensembl gene information as well, but if you want to query the genes table, you have to use create_gene_query_func() with additional arguments, as the field names don’t match the defaults.

karl

jessica.ma...@gmail.com

Jun 20, 2023, 2:44:02 PM
to R/qtl2 discussion
Thank you Dan and Karl. I will give mmconvert a go and try running these with GRCm39.

Jessica
