Database: hg19 Primary Table: snp150Common Row Count: 14,810,671 Data last updated: 2017-05-06
Database: hg19 Primary Table: snp147Common Row Count: 14,815,821 Data last updated: 2016-07-29
There is a difference of ~5,000 SNPs (snp150Common has 5,150 fewer rows than snp147Common).
Could you please explain the drop in numbers we see here?
Thanks,
Savvy
Hello Savvy,
Thank you for using the UCSC Genome Browser and for your inquiry.
One of our engineers has shared that the drop in the number of rs IDs is due to an error in the dbSNP database dump files that we downloaded.
In dbSNP b147, some variants submitted by 1000 Genomes Phase 3 were not merged into pre-existing rs#s but were instead assigned new rs#s. In b150, those variants were merged into the pre-existing rs#s, but the allele frequency database table still used the b147 rs# IDs for over 16,000 of them instead of the pre-existing/b150 rs# IDs. As a result, we were unable to associate the allele frequencies with those 16,000+ variants. Since those variants appeared to have no allele frequency info, we could not detect them as common, and they were left out of snp150Common.
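To make the mechanism concrete, here is a minimal Python sketch of the ID mismatch. It is only an illustration, not our actual pipeline: the rs IDs, frequencies, and the helper names (common_variants, remap_freqs) are all made up, and the 1% minor allele frequency cutoff is included as an assumption about how "common" is defined.

```python
# Toy illustration only: all rs IDs and frequencies below are hypothetical.

# Merge history: rs ID newly assigned in b147 -> pre-existing rs ID kept in b150.
merged_into = {"rs900000001": "rs1001", "rs900000002": "rs1002"}

# Variants as listed in b150, keyed by the surviving (pre-existing) rs ID.
b150_variants = ["rs1001", "rs1002", "rs1003"]

# Allele frequency table: two rows are still keyed by the retired b147 rs IDs.
allele_freqs = {"rs900000001": 0.12, "rs900000002": 0.30, "rs1003": 0.05}

MIN_MAF = 0.01  # assumed threshold for calling a variant "common"


def common_variants(variants, freqs, min_maf=MIN_MAF):
    """Keep variants whose looked-up frequency meets the threshold.

    A variant with no row in `freqs` is treated as having no frequency
    information, so it can never be called common.
    """
    return [rs for rs in variants if freqs.get(rs, 0.0) >= min_maf]


def remap_freqs(freqs, merged):
    """Re-key frequency rows from retired rs IDs to the surviving rs IDs."""
    return {merged.get(rs, rs): maf for rs, maf in freqs.items()}


# Lookup by b150 rs ID misses the rows keyed by b147 rs IDs.
naive = common_variants(b150_variants, allele_freqs)

# Re-keying through the merge history recovers them (hypothetical repair).
fixed = common_variants(b150_variants, remap_freqs(allele_freqs, merged_into))

print(naive)  # ['rs1003']
print(fixed)  # ['rs1001', 'rs1002', 'rs1003']
```

In this toy case, two of the three variants drop out of the "common" set purely because their frequency rows are keyed by the old IDs, which is the same effect described above at the scale of 16,000+ variants.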
Thank you again for sharing this issue. If you have any further questions, please reply to gen...@soe.ucsc.edu.
All messages sent to that address are archived on a publicly-accessible Google Groups forum.
If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.
Jairo Navarro
UCSC Genomics Institute