450K methylation array probe coordinates


Maria Gutierrez-Arcelus

Feb 19, 2013, 5:34:06 AM
to gen...@soe.ucsc.edu
Dear Genome Browser staff,

Recently I ran into a probe from the 450K methylation array that is wrongly mapped in the Genome Browser track. After looking closely at the probe, I realized this is most likely a problem with all the probes of the array that are on the reverse ("R") strand as reported by Illumina. Illumina provides the chromosome (CHR) and the position of the methylation site (MAPINFO), but the probe coordinates have to be inferred differently depending on whether STRAND is R or F. It seems that for the browser track all probes were assumed to be on the forward (F) strand.

For example:

Probe cg00000165

In the Illumina table it looks like this (selected columns only):
TargetID        GENOME_BUILD    CHR     MAPINFO SOURCESEQ       STRAND
cg00000165      37      1       91194674        AGGATCTGTTAGTACAGTGGCTTTTGATGGAACAGCTGAGGCACACATCG      R

In the Genome Browser the probe is mapped to chr1:91,194,675-91,194,724. But if one BLATs the probe sequence, one actually gets:

   ACTIONS      QUERY           SCORE START  END QSIZE IDENTITY CHRO STRAND  START    END      SPAN
---------------------------------------------------------------------------------------------------
browser details probe             50     1    50    50 100.0%     1   +   91194626  91194675     50
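
The arithmetic behind this example can be checked with a short sketch. It is only an illustration built from the numbers quoted above, not the method used by Illumina or by the lab that produced the track. The helper name `reverse_strand_span` is hypothetical, and it assumes that MAPINFO is the position of the C in the target CpG and that the probe's plus-strand sequence ends in that CpG, which is what the cg00000165 example suggests.

```python
# Values copied from the cg00000165 example in this thread.
MAPINFO = 91194674
SOURCESEQ = "AGGATCTGTTAGTACAGTGGCTTTTGATGGAACAGCTGAGGCACACATCG"
BLAT_HIT = (91194626, 91194675)       # 1-based, inclusive, + strand
BROWSER_TRACK = (91194675, 91194724)  # 1-based, inclusive, as displayed


def reverse_strand_span(mapinfo, probe_len):
    """Hypothetical helper: 1-based inclusive span of an R-strand probe.

    Assumption (inferred from this single example, not from Illumina
    documentation): the plus-strand probe sequence ends in the target
    CpG, and MAPINFO is the position of that CpG's C.
    """
    end = mapinfo + 1             # the G of the CpG is the last probe base
    start = end - probe_len + 1   # walk back over the rest of the probe
    return start, end


span = reverse_strand_span(MAPINFO, len(SOURCESEQ))
shift = BROWSER_TRACK[0] - span[0]

print("inferred span:", span)
print("matches BLAT hit:", span == BLAT_HIT)
print("track is shifted downstream by", shift, "bp")
```

Under these assumptions the inferred span coincides with the BLAT hit, and the interval shown in the track starts at the last base of the real probe instead of ending there.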

It is a small detail, but it might freak some people out (like me) if they are looking at a specific SNP and wrongly think they didn't filter their data correctly.

BTW: also be careful with the probes of the array that don't start with cg (I think they start with ch). Their mapping info is from build 36, and I don't know whether the method to infer the coordinates from the Illumina table would be the same.

Best regards,

Maria

Pauline Fujita

Feb 21, 2013, 9:59:41 PM
to Maria Gutierrez-Arcelus, gen...@soe.ucsc.edu
Hello Maria,

Thank you for your interest in the Genome Browser. Unfortunately this
data was provided to us already mapped by the Hudson Alpha Institute
ENCODE group. You will need to refer this matter to the HAIB data
contact for ENCODE (listed in the Credits section of the track
description page). You can see the description page for any track by
clicking on the gray bar to the left of the track in the main display
or by clicking on the track title above its pulldown menu. For this
track the description is here:

https://genome.ucsc.edu/cgi-bin/hgTrackUi?c=chr3&g=wgEncodeHaibMethyl450#TRACK_HTML


If you have further questions about the Browser please feel free to
contact the mailing list again at gen...@soe.ucsc.edu.


Best regards,

Pauline Fujita
UCSC Genome Bioinformatics Group
http://genome.ucsc.edu

Kate Rosenbloom

Feb 25, 2013, 1:40:03 PM
to Maria Gutierrez-Arcelus, gen...@soe.ucsc.edu
Hello Maria,

Thanks for reporting this problem and for contacting the HudsonAlpha lab
to follow up. They have verified that the reverse strand mappings are
indeed in error, as you suspected, and will be working to regenerate the
data files.

Cheers,
Kate
---
Kate Rosenbloom
UCSC Genome Bioinformatics

