RS SNP ID to Ensembl Gene ID conversion -- only in overlapping genes?


Andy McKenzie

Jul 29, 2015, 2:26:00 PM
to biomart-users
Hi everyone, 

I'm using biomaRt to convert RefSNP IDs to Ensembl Gene IDs. From what I understand, each SNP ID is mapped to the gene(s) that contain that SNP. Is this correct? I just want to make sure that a mapping means the SNP is actually "within" the gene, not merely "nearby". My code is below. Thanks,

Andy 

####

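library(biomaRt)

# Connect to the human dataset of the Ensembl variation (SNP) mart
mart.snp = useMart("snp", "hsapiens_snp")
# Gene mart; not used by the query below
ensembl = useMart("ensembl", dataset="hsapiens_gene_ensembl")

# SNP_vector is a character vector of RefSNP IDs (rs numbers), defined elsewhere
results = getBM(attributes = c("refsnp_id", "ensembl_gene_stable_id"),
  filters  = "snp_filter", values = SNP_vector, mart = mart.snp)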

Thomas Maurel

Jul 31, 2015, 8:24:54 AM
to Andy McKenzie, biomart-users
Dear Andy,

In Ensembl, variants are mapped at the transcript level. We annotate variants that overlap a transcript, but also variants that are upstream or downstream of a transcript. If you are only interested in variants that overlap a transcript or gene, you can use the “consequence type” filter (called so_parent_name in biomaRt) in the Ensembl snp mart and filter out all the variants that are tagged as “upstream_gene_variant” or “downstream_gene_variant”. More information about consequence types is available on the Ensembl website: http://www.ensembl.org/info/genome/variation/predicted_data.html
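One way to picture this filtering step is to drop the flanking consequence types locally, after the results have been retrieved. The sketch below is only an illustration: it uses a made-up data frame in place of a real getBM() result, and it assumes the consequence type was requested as an extra attribute named consequence_type_tv (check listAttributes() on the snp mart for the name in your release). The helper keep_overlapping is hypothetical and is not part of biomaRt.

```r
# Toy stand-in for a getBM() result; the IDs and consequences are invented.
# Assumption: the consequence type was retrieved in a column called
# "consequence_type_tv" (verify with listAttributes() on the snp mart).
results <- data.frame(
  refsnp_id = c("rs0001", "rs0002", "rs0003", "rs0004"),
  ensembl_gene_stable_id = c("ENSG_A", "ENSG_A", "ENSG_B", "ENSG_C"),
  consequence_type_tv = c("missense_variant", "upstream_gene_variant",
                          "intron_variant", "downstream_gene_variant"),
  stringsAsFactors = FALSE
)

# Consequence types that mean "near a transcript" rather than "within" it
flanking_terms <- c("upstream_gene_variant", "downstream_gene_variant")

# Hypothetical helper: keep only rows whose consequence is not a flanking term
keep_overlapping <- function(df, drop_terms = flanking_terms) {
  df[!(df$consequence_type_tv %in% drop_terms), , drop = FALSE]
}

overlapping <- keep_overlapping(results)
print(overlapping)
```

Filtering locally like this keeps the query itself unchanged, at the cost of downloading rows that are then discarded.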

If you have more questions regarding the Ensembl Variation data, please feel free to email Ensembl helpdesk: help...@ensembl.org.

Hope this helps,
Regards,
Thomas
--
Thomas Maurel
Bioinformatician - Ensembl Production Team
European Bioinformatics Institute (EMBL-EBI)
European Molecular Biology Laboratory
Wellcome Trust Genome Campus
Hinxton
Cambridge CB10 1SD
United Kingdom

Andy McKenzie

Jul 31, 2015, 10:42:13 AM
to biomart-users, amc...@gmail.com, mau...@ebi.ac.uk
Great, thank you! One more question: how many base pairs away from a transcript is considered upstream or downstream? 

Andy 

Thomas Maurel

Jul 31, 2015, 12:00:10 PM
to Andy McKenzie, biomart-users
Dear Andy,

We report anything within 5 kb of a transcript. I believe you should also exclude the “intergenic_variant” consequence type (http://www.sequenceontology.org/browser/current_svn/term/SO:0001628), because the BioMart filter returns variants tagged with a given term (consequence type) and with its child terms (in this case “upstream_gene_variant” and “downstream_gene_variant”).
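To make the 5 kb window concrete, here is a rough sketch of how a variant position could be classified relative to one transcript. This is not Ensembl's implementation, only an illustration of the rule described above. The function classify_position, its arguments, and the coordinates are all hypothetical, and it assumes the window is inclusive at exactly 5,000 bp and that strand is coded as 1 or -1.

```r
# Flank size in base pairs (5 kb, as stated in the reply above)
flank_bp <- 5000

# Hypothetical helper: classify a variant position relative to one transcript.
# Assumptions: coordinates are on the same chromosome, tx_start <= tx_end,
# strand is 1 (forward) or -1 (reverse), and the window boundary is inclusive.
classify_position <- function(pos, tx_start, tx_end, strand = 1,
                              flank = flank_bp) {
  if (pos >= tx_start && pos <= tx_end) {
    return("overlapping")
  }
  if (pos < tx_start - flank || pos > tx_end + flank) {
    return("outside_window")
  }
  before_start <- pos < tx_start
  # On the forward strand, lower coordinates are upstream of the transcript;
  # on the reverse strand the direction is flipped.
  if ((before_start && strand == 1) || (!before_start && strand == -1)) {
    "upstream"
  } else {
    "downstream"
  }
}

# Made-up transcript spanning positions 10000 to 20000
classify_position(15000, 10000, 20000)              # inside the transcript
classify_position(7000, 10000, 20000)               # 3 kb before the start
classify_position(7000, 10000, 20000, strand = -1)  # same position, reverse strand
classify_position(30000, 10000, 20000)              # 10 kb past the end
```

Only the "overlapping" case corresponds to a SNP that lies within the transcript; the other labels are the ones to exclude.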

Hope this helps,
Regards,
Thomas