dbSNP files for GRCh38.hg38 version


Rushiraj Manchiganti

Jul 23, 2014, 1:48:18 PM
to gen...@soe.ucsc.edu
Dear UCSC team,

I intend to identify SNPs from my data using GRCh38.hg38 as reference.  
I searched the site for corresponding dbSNP files which can be used with 
this build for differentiating already known variants.  The latest dbSNP 
files available are for hg19.  However, there is a liftover file at this 
link (http://hgdownload.cse.ucsc.edu/goldenPath/hg38/liftOver/).  So, 
should I use this liftover file and then use the dbSNP files from the 
hg19 assembly?  Or would it be better for me to use hg19 as reference?

Thanking you for your inputs and recommendations.

Regards,

Rushiraj

Jonathan Casper

Jul 25, 2014, 2:44:29 PM
to Rushiraj Manchiganti, gen...@soe.ucsc.edu

Hello Rushiraj,

Thank you for your question about identifying common SNPs in your data. Ultimately it's up to you to decide which approach will best serve your needs, but I can give you some additional information.

We are currently in the process of releasing a dbSNP track for hg38. Sometime next week we expect to have a version of it available on our test server at http://genome-test.soe.ucsc.edu. Please note that this track will not have undergone any of our quality assurance checks, and that additional changes may be made prior to the public release. Whether that is a better tool for your analysis than the track lifted over from hg19 is up to you.

Regarding using hg19 instead of hg38, it is certainly true that hg19 has more data mapped to it at this time. Depending on the type of analysis you are doing, that might be helpful. A good start might be to check whether your SNPs occur in regions that have changed substantially between the hg19 and hg38 assemblies. If so, that would weigh in favor of working with hg38.
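One rough way to gauge how many SNPs fall in regions that changed between assemblies is to lift the hg19 positions over to hg38 and look at the records that could not be mapped. The Python sketch below tallies those records by the reason given for each failure. It is only an illustration: it assumes the unmapped output places a comment line beginning with "#" (for example "#Deleted in new") before each record, and the sample coordinates and rs_example names are made up.

```python
from collections import Counter


def summarize_unmapped(lines):
    """Count unmapped records per stated reason.

    Assumption: every record is preceded by a '#' comment line that
    says why the record could not be lifted to the new assembly.
    """
    counts = Counter()
    reason = "unknown"  # used if a record appears before any '#' line
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Remember the reason; it applies to the record(s) that follow.
            reason = line.lstrip("#").strip()
        else:
            counts[reason] += 1
    return counts


# Hypothetical sample input; real input would be read from the unmapped file.
sample = [
    "#Deleted in new",
    "chr1\t1000\t1001\trs_example_1",
    "#Partially deleted in new",
    "chr2\t2000\t2001\trs_example_2",
    "#Deleted in new",
    "chr3\t3000\t3001\trs_example_3",
]

summary = summarize_unmapped(sample)
for reason, n in summary.most_common():
    print(f"{reason}: {n}")
```

If a large share of the SNPs you care about end up in this unmapped set, that suggests the regions differ between the two assemblies, which is the situation where working directly on hg38 is more attractive.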

I hope this is helpful. If you have any further questions, please reply to gen...@soe.ucsc.edu or genome...@soe.ucsc.edu. Questions sent to those addresses will be archived in publicly accessible forums for the benefit of other users. If your question contains sensitive data, you may send it instead to genom...@soe.ucsc.edu.

--
Jonathan Casper
UCSC Genome Bioinformatics Group


