The basics of using MEME to discover motifs

CharlesEGrant

Apr 2, 2019, 8:26:54 PM

to meme-...@googlegroups.com

MEME is a tool for performing de novo motif discovery in DNA, RNA, and protein sequences. This is a very brief summary of how to use the public MEME web application to discover motifs in your sequence data. Keep in mind that MEME performs a statistical analysis of your sequences. Even under the best circumstances, your sequence data must contain at least two copies of a motif for MEME to be able to discover it. More copies may be needed to provide statistically significant results.

1. Open the MEME web application in your web browser.

2. In the section titled "Input the primary sequences" you may upload a FASTA file containing your sequences by selecting "Upload sequences" and setting the name of the file containing your sequence data in the box labeled "Choose File". Alternatively, you can paste the sequence data directly into a text box by selecting "Type in sequences". The sequences you type in must be in FASTA format, including the header lines.

3. Set the site distribution model MEME should use:

Do you think each sequence contains zero or one instance of a motif? Select the 'Zero or one per sequence' model. This is the default, and usually a good starting point.
Do you think each sequence contains exactly one instance of a motif? Select the 'One per sequence' model.
Do you think each sequence may contain any number of instances of a motif? Select the 'Any number of repetitions' model.

4. Adjust any of the other parameters on the page. Using the defaults is usually a good starting point. You may optionally provide your email address if you want to receive a link to your job via email.

5. Click on the "Start Search" button. If you provided an email address, you will be sent an email with a link to your job's results.
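
If you later install MEME locally, the choices made in the web form correspond roughly to command line options. The following is only a minimal sketch, assuming a local install that puts a `meme` executable on your path; the helper name `build_meme_command`, the file name, and the output directory are made up for illustration, and the options should be checked against the MEME command line documentation before use. It builds the argument list without running anything.

```python
import shlex

# Assumed mapping from the web form's site distribution labels to the
# values accepted by MEME's -mod command line option.
SITE_MODELS = {
    "Zero or one per sequence": "zoops",
    "One per sequence": "oops",
    "Any number of repetitions": "anr",
}


def build_meme_command(fasta_path, model="Zero or one per sequence",
                       nmotifs=3, alphabet="dna", out_dir="meme_out"):
    """Return an argument list for a local `meme` run (nothing is executed)."""
    if model not in SITE_MODELS:
        raise ValueError(f"unknown site distribution model: {model!r}")
    if alphabet not in ("dna", "rna", "protein"):
        raise ValueError(f"unknown alphabet: {alphabet!r}")
    return [
        "meme", fasta_path,
        f"-{alphabet}",               # sequence alphabet
        "-oc", out_dir,               # output directory (overwritten if present)
        "-mod", SITE_MODELS[model],   # site distribution model from step 3
        "-nmotifs", str(nmotifs),     # number of motifs to look for
    ]


# "sequences.fasta" is a hypothetical file name.
print(shlex.join(build_meme_command("sequences.fasta")))
```

The list form can be passed directly to `subprocess.run` once you are happy with the options.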

By default MEME will find 3 motifs. It tries to find the best motifs first, but due to the enormous search space it is impossible to guarantee that they will always be listed best to worst. You should always check the E-value of the motifs found by MEME, as sometimes the motifs found will not be statistically significant. Generally, if a motif has an E-value larger than 0.05 it is not significant.
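
If you have many result files to check, the E-value test can be scripted. This is a rough sketch, assuming each motif in MEME's text output has a summary line that starts with `MOTIF` and contains `E-value = <number>`; that line shape, the helper name `significant_motifs`, and the sample text (invented for illustration, not real MEME output) are all assumptions, so adjust the pattern to match your own files.

```python
import re

# Assumed shape of a motif summary line in MEME's text output.
MOTIF_LINE = re.compile(r"^MOTIF\s+(\S+).*?E-value\s*=\s*(\S+)", re.MULTILINE)


def significant_motifs(meme_text, threshold=0.05):
    """Return (motif name, E-value) pairs whose E-value is at most threshold."""
    found = [(name, float(evalue)) for name, evalue in MOTIF_LINE.findall(meme_text)]
    return [(name, evalue) for name, evalue in found if evalue <= threshold]


# Made-up sample text, only to show the function being called.
SAMPLE = """\
MOTIF ACGTACGT MEME-1 width = 8 sites = 10 llr = 120 E-value = 1.5e-004
MOTIF TTTTAAAA MEME-2 width = 8 sites = 4 llr = 40 E-value = 3.2e+002
"""

for name, evalue in significant_motifs(SAMPLE):
    print(name, evalue)
```

Motifs with an E-value above the threshold are dropped here, which matches the 0.05 rule of thumb above.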

Important things to know:

The public MEME web application limits the size of the sequence file to 80 MB and 500,000 sequences. If you need to analyze a larger file you'll need to install the MEME software locally. If your sequence data is highly redundant you may be able to use MEME-ChIP instead.
The sequence file must be a plain text FASTA file. The MEME application is unable to process Microsoft Word files or compressed files. See "Why can't the MEME Suite read my sequence data?" for more information.
Jobs running longer than five hours will be halted.
Completed jobs are kept on the web site for four days. Be sure to download your results before they are removed!
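
Before uploading, you can check a file against the limits above yourself. The sketch below is only an approximation of what the server checks: it assumes "80 MB" means 80,000,000 bytes, treats any file that does not decode as ASCII as "not plain text" (which would also reject FASTA headers containing non-ASCII characters), and counts sequences by counting header lines that start with `>`. The function name `check_upload` is made up for this example.

```python
import os

MAX_BYTES = 80 * 1000 * 1000   # assumption: "80 MB" read as decimal megabytes
MAX_SEQUENCES = 500_000


def check_upload(path):
    """Return a list of problems found; an empty list means none were detected."""
    problems = []
    if os.path.getsize(path) > MAX_BYTES:
        problems.append("file exceeds the 80 MB limit")

    n_sequences = 0
    first_content_line = None
    try:
        with open(path, encoding="ascii") as handle:
            for line in handle:
                if first_content_line is None and line.strip():
                    first_content_line = line
                if line.startswith(">"):
                    n_sequences += 1   # one header line per sequence
    except UnicodeDecodeError:
        # Word documents and compressed files contain non-text bytes.
        problems.append("file is not plain text (Word or compressed file?)")
        return problems

    if first_content_line is None or not first_content_line.startswith(">"):
        problems.append("first non-blank line is not a FASTA header ('>')")
    if n_sequences > MAX_SEQUENCES:
        problems.append("file exceeds the 500,000 sequence limit")
    return problems
```

Calling `check_upload` on your file returns an empty list when none of these problems is found; that does not guarantee the server will accept the file.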

Information about the various parameters can be found in the MEME command line documentation.

Information about the MEME algorithm can be found in these papers.

For a brief overview of the other tools in the MEME Suite see "How do I use MEME suite for searching for motifs?"