Getting genomic coordinates when searching upstream regions

CharlesEGrant

Oct 26, 2012, 3:26:13 PM
to meme-...@googlegroups.com
The coordinates reported by MAST, FIMO, and GLAM2SCAN are indexes into the sequence being scanned. When scanning whole genomes, these are the genomic coordinates, but when scanning upstream regions they are only indexes into the upstream sub-sequence extracted from the full genome. To translate the start and stop positions reported by MAST, FIMO, and GLAM2SCAN into genomic coordinates, you first need the genomic coordinates of the full upstream sequence containing the match. These can be obtained from the RSAT retrieve sequence tool:
    Select the name of the organism from the combo box 
    Select the "All" radio button for "Genes" 
    Set the feature type to "CDS" 
    Set the "Sequence type" to "Upstream" 
    Check "Prevent overlap with neighboring genes" 
    Un-check "Mask repeats" 
    Un-check "Admit imprecise positions" 
    Click on the "GO" button. 
    Follow the link to the results page 
    Search for the sequence names from the MAST/FIMO/GLAM2SCAN search results in the list of upstream sequences. The header will contain the genomic coordinates of the upstream sequence. 
    Add the coordinates from the MAST/FIMO/GLAM2SCAN entry to the starting genomic coordinate of the upstream sequence to obtain the genomic coordinates of the match. Note that if the gene is on the reverse strand, RSAT reports the reverse-complemented sequence, so the match positions then count from the opposite end of the upstream region.
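The arithmetic in the last step can be sketched in a few lines of Python. This is only an illustration, not part of MEME or RSAT: the function name to_genomic and its parameters are hypothetical, and it assumes that all positions are 1-based and inclusive, that up_start <= up_end are the genomic coordinates of the upstream sequence taken from the RSAT header, and that for a reverse-strand gene position 1 of the scanned sequence corresponds to up_end. Check these assumptions against your own RSAT output before relying on the numbers.

```python
def to_genomic(match_start, match_stop, up_start, up_end, reverse_strand=False):
    """Map match positions inside an upstream sequence to genomic coordinates.

    Assumptions (not taken from the MEME or RSAT documentation):
      - match_start/match_stop are 1-based, inclusive positions in the
        scanned upstream sequence, with match_start <= match_stop;
      - up_start <= up_end are the 1-based, inclusive genomic coordinates
        of that upstream sequence;
      - for a reverse-strand gene the scanned sequence is the reverse
        complement, so position 1 corresponds to up_end.
    Returns (genomic_start, genomic_stop) with genomic_start <= genomic_stop.
    """
    length = up_end - up_start + 1
    if not (1 <= match_start <= match_stop <= length):
        raise ValueError("match lies outside the upstream sequence")
    if reverse_strand:
        # Offsets count back from the high end of the region.
        return up_end - match_stop + 1, up_end - match_start + 1
    # Offsets count forward from the low end of the region.
    return up_start + match_start - 1, up_start + match_stop - 1
```

For example, under these assumptions a match at positions 5 to 12 of an upstream sequence spanning genomic positions 1000 to 1799 maps to 1004 to 1011 on the forward strand, and to 1788 to 1795 if the gene is on the reverse strand. The "- 1" and "+ 1" terms exist only because both coordinate systems are taken to be 1-based; drop or adjust them if your coordinates are 0-based.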