Start and end co-ordinates of MEME motif


Rajeev R

Feb 12, 2017, 4:52:11 PM
to MEME Suite Q&A
Hello All,

I am using the command-line version of MEME-ChIP (MEME version 4.11.2) to identify motifs in ChIP-seq data. The command I used is below.

```
/shares/bioinfo/installs/meme/bin/meme-chip fea4_rep1_summits_Plus100.fa -dna -meme-mod zoops -meme-nmotifs 50 -nmeme 2665 -meme-p 20 -meme-maxsize 5000000 -meme-minw 6 -o Fea4_MEME
```

1) This command outputs the discovered motifs. I am interested in identifying the exact coordinates of the identified motifs in each of my input sequences for downstream analysis. However, I am confused about the output below. What is 'Start' in the output? I assumed 'Start' was the start position of the motif within my input sequence, but it is not (I checked my input sequences).


--------------------------------------------------------------------------------
Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name            Strand  Start   P-value             Site 
-------------            ------  ----- ---------            ------
9:154835553-154835753        -     24  3.12e-04 CAGGCCGTCG CGTCAC GTCAGTACAG
9:154579056-154579256        -     46  3.12e-04 GTGGGGGGGC CGTCAC GTATCTCGCC
9:152250740-152250940        -     40  3.12e-04 CGTTATCCTA CGTCAC GCCCACGTTT
9:149439460-149439660        -     30  3.12e-04 GTGGCGCTGC CGTCAC CTCGTGTTCC
9:147881238-147881438        -     34  3.12e-04 TCGCCGTCAC CGTCAC CGTCACATTC
9:138040612-138040812        -     78  3.12e-04 CGTTGAGCAG CGTCAC GGGCACCCGC
9:136112286-136112486        +     56  3.12e-04 GTCGACCTGA CGTCAC CGGAGAAAGT

2) According to the example on the website, MEME-ChIP is supposed to output a GFF file with my motif coordinates, but my FIMO GFF file is empty.



Any information on how I can extract the start and end sites of my favorite motif will be very helpful.


Thanks,

Rajiv

CharlesEGrant

Feb 13, 2017, 3:29:06 PM
to meme-...@googlegroups.com
Before MEME is run on your sequences, they are trimmed to their central 100bp. The “Start” position in the MEME output is relative to the trimmed sequence. Furthermore, MEME doesn’t parse the coordinates in the sequence header, so the first position in a trimmed sequence has the coordinate 1. In the output from MEME-ChIP you will find a ‘seqs-centered’ file. These are the sequences randomly sampled from your input set, and trimmed to their central 100bp.
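
To make the arithmetic concrete, here is a minimal Python sketch of how a MEME 'Start' value could be mapped back to genomic coordinates. The function name and its parameters are hypothetical, not part of the MEME Suite. It assumes the header has the form `chrom:start-end`, that the header start is 0-based (as in BED-derived FASTA headers; set `header_is_zero_based=False` otherwise), that trimming removes an equal number of bases from both ends (rounding down for odd differences), and that 'Start' is the 1-based position of the leftmost base of the site on the forward strand, even for '-' strand sites. Please check a few results against the 'seqs-centered' file before relying on it.

```python
def motif_genomic_coords(header, meme_start, motif_width,
                         trimmed_len=100, header_is_zero_based=True):
    """Map a MEME 'Start' (1-based, relative to the trimmed sequence)
    back to 1-based inclusive genomic coordinates.

    Hypothetical helper: the trimming offset and the header convention
    are assumptions, not taken from the MEME Suite source.
    """
    # Split "9:136112286-136112486" into chromosome and interval.
    chrom, span = header.rsplit(":", 1)
    start_s, end_s = span.split("-")
    region_start, region_end = int(start_s), int(end_s)

    # 1-based genomic position of the first base of the input sequence.
    first_base = region_start + 1 if header_is_zero_based else region_start
    full_len = region_end - first_base + 1

    # Bases removed from the left when keeping the central trimmed_len bp
    # (assumed to be half of the excess length, rounded down).
    trim_offset = max(0, (full_len - trimmed_len) // 2)

    site_start = first_base + trim_offset + (meme_start - 1)
    site_end = site_start + motif_width - 1
    return chrom, site_start, site_end


# Example: the '+' strand site from the question (Start = 56, motif CGTCAC, width 6).
chrom, start, end = motif_genomic_coords("9:136112286-136112486", 56, 6)
print(chrom, start, end)
```

For a 200bp input sequence this adds an offset of 50 to every 'Start' value, which is why the reported positions do not line up with the untrimmed input.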

MEME-ChIP always runs FIMO using FIMO's default p-value threshold of 0.0001. It might be that none of the matches passed FIMO’s p-value threshold. You might try running FIMO separately on the MEME and DREME output from MEME-ChIP, using a less stringent p-value threshold. You might also check the E-values reported by MEME and DREME to evaluate how significant the discovered motifs are. Note that FIMO can parse the coordinates in the sequence header if you use the '--parse-genomic-coord' option.
