How do I use MEME suite for searching for motifs?


James Johnson

Nov 5, 2015, 6:51:31 PM
to meme-...@googlegroups.com
Very generally, what options exist for motif discovery, comparison, and searching, and where should I look for more information before bothering the MEME team with my very vague questions?

James Johnson

Apr 17, 2018, 5:31:42 PM
to meme-...@googlegroups.com
Note that this post covers only some of the most commonly used tools in the MEME Suite; if you want to know about the others, it provides links at the bottom.

Discovery

If you have a set of sequences and you want to discover new motifs, you need to use MEME, DREME or MEME-ChIP. MEME can discover more complex motifs than DREME, but it requires far more processing resources (see MEME: Dataset size and run time issues), so you may need to randomly subsample your dataset (see Tips for using MEME with ChIP-seq data). DREME discovers lots of short motifs relatively quickly (compared to MEME) and can handle much larger datasets before the runtime becomes intractable. If you have a control sequence set (aka negative sequences) containing motifs you don't want to discover, then you can perform discriminative motif discovery with both MEME and DREME. The method for MEME is a little more involved (see How do I perform discriminative motif discovery using the command line version of MEME?). MEME-ChIP is designed to make running MEME and DREME (as well as Tomtom and CentriMo) on ChIP-seq data easy. All you have to do is provide it with a set of sequences that are all the same length (between 300 bp and 500 bp) and centered on the ChIP-seq peaks, and it will do the rest.
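As a rough illustration of the random subsampling mentioned above, here is a minimal Python sketch that picks a fixed number of sequences from FASTA-formatted text. The function names (`parse_fasta`, `subsample`) and the seed handling are my own assumptions for the example and are not part of the MEME Suite.

```python
import random


def parse_fasta(lines):
    """Parse FASTA-formatted lines into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def subsample(records, n, seed=0):
    """Return n records drawn uniformly at random without replacement.

    If n is at least the number of records, all records are returned
    in their original order. The seed makes the draw reproducible.
    """
    if n >= len(records):
        return list(records)
    rng = random.Random(seed)
    return rng.sample(records, n)


# Hypothetical usage: keep 2 of 3 toy sequences.
example = ">a\nACGT\nAC\n>b\nGG\n>c\nTT\n".splitlines()
subset = subsample(parse_fasta(example), 2, seed=1)
for header, seq in subset:
    print(f">{header}\n{seq}")
```

Fixing the seed means you can rerun motif discovery on exactly the same subset later; changing it gives you a different random subset to check that the motifs you find are stable.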

Comparison
If you have an existing motif (i.e. from MEME, DREME or maybe a consensus sequence) and want to find other similar motifs, then you should use Tomtom. Tomtom can take in a file of query motifs and compare them to multiple files containing potentially similar motifs. Unless you have hundreds of motifs to search with, I recommend you use the website version, as it can automatically create MEME-style motifs from consensus sequences (allowing for IUPAC codes) or frequency/count matrices.
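To give a feel for what turning a consensus sequence into a matrix involves, here is a small Python sketch that expands IUPAC codes into per-position base probabilities. It simply splits the probability evenly among the bases each code allows; that is an assumption made for illustration and not necessarily how the Tomtom website builds its motifs. The function name `consensus_to_matrix` is hypothetical.

```python
# Standard IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def consensus_to_matrix(consensus):
    """Convert an IUPAC consensus string to rows of [A, C, G, T] probabilities.

    Each allowed base at a position gets an equal share of the probability
    (an illustrative assumption). Unknown symbols raise KeyError.
    """
    rows = []
    for symbol in consensus.upper():
        allowed = IUPAC[symbol]
        share = 1.0 / len(allowed)
        rows.append([share if base in allowed else 0.0 for base in "ACGT"])
    return rows


# Hypothetical usage: print the matrix for a short consensus.
for row in consensus_to_matrix("TGAYN"):
    print(" ".join(f"{p:.2f}" for p in row))
```

Each row sums to 1, which is the shape a frequency matrix needs; real motifs from MEME or DREME will usually have less clear-cut columns than this.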

Sequence Search

If you have a motif that you want to find in a set of sequences, then you should use FIMO. Note that you can't just scan a genome with a motif and expect that all the sites you find are biologically active, because for the most part chance matches will swamp the biologically relevant matches. This is a well-known problem in searching for motifs, jokingly called "The Futility Theorem" (Wasserman WW, Sandelin A. Applied bioinformatics for the identification of regulatory elements. Nat Rev Genet 2004;5:276-87.). Basically, you will need to combine the motif with other sources of information. The forum has some more useful information under the tag FIMO.
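To see why chance matches swamp the real ones, a back-of-the-envelope calculation helps. The Python sketch below multiplies the number of positions scanned by a p-value threshold to estimate how many matches you would expect from chance alone. It assumes the p-values are well calibrated and treats every position as an independent test, which is a simplification; the function name and all numbers are hypothetical.

```python
def expected_chance_matches(sequence_length, motif_width, p_threshold,
                            both_strands=True):
    """Estimate how many motif matches are expected purely by chance.

    Counts the start positions where a motif of the given width fits,
    doubles that if both strands are scanned, and multiplies by the
    p-value threshold used to call a match.
    """
    positions = max(sequence_length - motif_width + 1, 0)
    if both_strands:
        positions *= 2
    return positions * p_threshold


# Hypothetical numbers: a 3 billion bp genome, a 10 bp motif, and a
# p-value threshold of 1e-4, scanning both strands.
print(round(expected_chance_matches(3_000_000_000, 10, 1e-4)))
```

With these made-up numbers the estimate comes out at roughly 600,000 chance matches, far more than the number of functional sites you would plausibly expect for a single factor, which is why the motif has to be combined with other evidence.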


A lot more information is available in the papers, on this forum or even on the website. If you have further questions, please try looking at some of that information first.