inconsistencies between Infinium 450K mapping and hg38:refGene

Susan Huse

Jul 31, 2014, 4:40:06 PM
to gen...@soe.ucsc.edu
Hi, 

I have been using the Illumina Infinium 450K HumanMethylation BeadChip array, which includes an annotation file based on the UCSC reference genome.
The methylation sites are mapped to: TSS1500, TSS200, 5’UTR, 1stExon, Body, 3’UTR.
We were interested in differentiating the methylation within “Body” as either exon or intron.

I have downloaded the refGene table from genome-mysql.cse.ucsc.edu (user=genome, database=hg38, table=refGene).
I then wrote a simple Perl script to look up the chromosome and coordinate of each methylation site from the Infinium annotations,
then search the refGene table and assign one of the region labels above, except that Body is replaced by Intron or Exon.
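
For readers following along, the exon/intron step described above can be sketched roughly as follows. This is a minimal Python illustration, not the Perl script from this post, and the function names are hypothetical. It relies on refGene storing txStart, txEnd, exonStarts, and exonEnds as 0-based, half-open coordinates, and it assumes the manifest coordinate is 1-based.

```python
def parse_coords(field):
    """Parse a refGene exonStarts/exonEnds value such as '100,400,' into ints."""
    if isinstance(field, bytes):  # some MySQL drivers return blob columns as bytes
        field = field.decode()
    return [int(x) for x in field.split(",") if x]


def classify_position(pos_1based, tx_start, tx_end, exon_starts, exon_ends):
    """Return 'Exon', 'Intron', or None (outside the transcript).

    pos_1based is assumed to be a 1-based coordinate; the refGene
    fields are 0-based, half-open intervals.
    """
    pos0 = pos_1based - 1  # convert to 0-based to match refGene
    if not (tx_start <= pos0 < tx_end):
        return None
    starts = parse_coords(exon_starts)
    ends = parse_coords(exon_ends)
    for start, end in zip(starts, ends):
        if start <= pos0 < end:
            return "Exon"
    return "Intron"
```

An off-by-one between the two coordinate conventions only changes the label at exon boundaries, so it would not by itself explain widespread disagreement with the array labels.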

My code appears to work, in that when I hand check the output, the mapping is consistent with the table I just downloaded from UCSC.
Unfortunately, it doesn’t seem to be consistent with the labels associated with the Infinium array.
The Infinium data I am using is: humanmethylation450_15017482_v1-2.csv with a file date of May 27, 2014, so it should be current.

When I contacted Illumina, they simply said, “We got that from UCSC; we don’t know anything more than that.”
Is there any way to know why the two do not appear to be consistent?
Am I downloading the correct data?  
Is there something else I should know?

Thanks so much for your time!!

Smiles,
Sue



----------------------------------------- 
Susan M. Huse, PhD
Assistant Professor (Research)
Alpert Medical School
Department of Pathology and Laboratory Medicine
70 Ship Street, Room 505
Brown University
Campus Box G-E5
Providence, RI 02912

Matthew Speir

Jul 31, 2014, 5:49:03 PM
to Susan Huse, gen...@soe.ucsc.edu
Hi Susan,

Thank you for your question about the refGene table and Illumina's 450K HumanMethylation BeadChip array. It could be that Illumina made the array based on the refGene table for the hg19 version of the human reference genome. Have you tried comparing the mappings from the hg19 refGene table to what's in their array?
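
As a rough illustration of that comparison, the lookup itself does not need to change between assemblies; only the database selected at connection time (hg19 rather than hg38) differs. The sketch below is hypothetical: the function name is made up, the `%s` placeholders assume a typical Python MySQL driver, and the probe coordinate is assumed to be 1-based. The column names are those of the refGene table.

```python
# Columns of the refGene table used for region assignment.
REFGENE_COLUMNS = (
    "name", "name2", "chrom", "strand", "txStart", "txEnd",
    "cdsStart", "cdsEnd", "exonStarts", "exonEnds",
)


def refgene_overlap_query(chrom, pos_1based):
    """Build a parameterized query for transcripts overlapping one position.

    The database (e.g. hg19 or hg38) is chosen when connecting, not here.
    refGene coordinates are 0-based and half-open, so the assumed 1-based
    position is shifted by one before comparison.
    """
    pos0 = pos_1based - 1
    sql = (
        "SELECT " + ", ".join(REFGENE_COLUMNS) + " FROM refGene "
        "WHERE chrom = %s AND txStart <= %s AND txEnd > %s"
    )
    return sql, (chrom, pos0, pos0)
```

Running the same query against both databases for a handful of probes shows quickly whether the array's coordinates line up with genes in one assembly but not the other.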

I hope this is helpful. If you have any further questions, please reply to gen...@soe.ucsc.edu. All messages sent to that address are archived on a publicly-accessible Google Groups forum. If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.

Matthew Speir
UCSC Genome Bioinformatics Group
--


Susan Huse

Aug 1, 2014, 12:19:18 PM
to Matthew Speir, gen...@soe.ucsc.edu
Thank you Matthew!

I now seem to be at least in the general vicinity of getting the same gene information. Before, it felt a bit like a random gene section generator.
There are still some differences that I don’t yet understand, but I can hopefully resolve those on my own.

Thanks again, 
Sue