MEME to find motif IN repeat sequences


Devinder Kaur

Jun 9, 2017, 12:14:32 PM
to MEME Suite Q&A

I am working on repeat elements (retrotransposons) that share 80-96% identity at the DNA/RNA level. I have two subsets: (1) highly expressed and (2) lowly expressed copies. Is MEME suitable for discovering DNA/RNA motifs in the 5' end internal promoter region that could differentiate the two subsets and suggest a reason for the high/low expression? I tried DREME, but it did not find any discriminating motif.

CharlesEGrant

Jun 14, 2017, 7:27:14 PM
to MEME Suite Q&A
Either MEME or DREME may work for this, but there are some caveats. Both MEME and DREME discover motifs by identifying short sub-sequences that are statistically over-represented in your sequence data. If your sequences are all 80-96% identical, then both MEME and DREME may simply end up aligning the repeats, and may not be able to pick out smaller and more subtle structures. If you provide primary and control sets, DREME may be able to identify motifs that are enriched in the primary set but not in the control set. Note, though, that DREME is limited to motifs <= 8 positions wide.
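
A discriminative DREME run from the command line could be assembled roughly as in the sketch below. `-p` and `-n` name the primary and control files. The file names, the output directory, and the use of `-oc` and `-dna` are assumptions for illustration, so check them against the DREME documentation for your installed version. The sketch only builds and prints the command; it does not run DREME.

```python
import shlex


def dreme_command(primary, control, out_dir):
    """Build the argv for a DREME run that contrasts primary against control."""
    # -p: primary (positive) sequences, -n: control (negative) sequences.
    # -oc (overwrite output directory) and -dna are assumed typical options.
    return ["dreme", "-oc", out_dir, "-dna", "-p", primary, "-n", control]


# Hypothetical file names: high-expression copies as primary, low as control.
cmd = dreme_command("high_expr.fa", "low_expr.fa", "dreme_high_vs_low")
print(shlex.join(cmd))
```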

You may also be able to use MEME in discriminative mode. If you are using the MEME web application, just click on the “Discriminative mode” radio button under the “Select the motif discovery mode” heading. Once you’ve done that, you’ll be able to upload primary and control sequences. As with your plan for DREME, you could use the high-expression sequences as the primary set and the low-expression sequences as the control set. If you are using the command line version of MEME, you’ll need to generate a position specific prior file using the ‘psp-gen’ utility. This is described here: http://meme-suite.org/doc/psp-gen.html. Once you’ve generated the position specific prior file, you run MEME using the ‘-psp’ option. The MEME command line options are described here: http://meme-suite.org/doc/meme.html.
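
For the command line route, the two steps might look roughly like the sketch below. It assumes that psp-gen takes the primary and control files via `-pos` and `-neg` and writes the prior to standard output. The file names (`high_expr.fa`, `low_expr.fa`, `high_vs_low.psp`, `meme_high_vs_low`) are placeholders, and `-dna` and `-oc` are my assumptions about a typical DNA run, so check everything against the documentation linked above. Nothing is executed unless you call `run_discriminative_meme` yourself.

```python
import shlex
import subprocess


def discriminative_meme_commands(primary, control, psp_path, out_dir):
    """Build argv lists for psp-gen, then MEME with a position specific prior."""
    # psp-gen is assumed to print the prior to stdout; the caller saves it to psp_path.
    psp_gen = ["psp-gen", "-pos", primary, "-neg", control]
    # -psp points MEME at the saved prior; -dna and -oc are assumed options.
    meme = ["meme", primary, "-dna", "-oc", out_dir, "-psp", psp_path]
    return psp_gen, meme


def run_discriminative_meme(primary, control, psp_path, out_dir):
    """Run both steps; requires psp-gen and meme to be on the PATH."""
    psp_gen, meme = discriminative_meme_commands(primary, control, psp_path, out_dir)
    with open(psp_path, "w") as psp_file:
        subprocess.run(psp_gen, stdout=psp_file, check=True)
    subprocess.run(meme, check=True)


# Show the commands without running them (hypothetical file names).
psp_gen_cmd, meme_cmd = discriminative_meme_commands(
    "high_expr.fa", "low_expr.fa", "high_vs_low.psp", "meme_high_vs_low"
)
print(shlex.join(psp_gen_cmd), ">", "high_vs_low.psp")
print(shlex.join(meme_cmd))
```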


For both MEME and DREME you may want to run two trials, switching the primary and control sequences between the high expression and low expression sets. As far as possible, you’ll want to trim your input sequences to just the internal promoter region likely to contain the suspected motifs. That should cut down on the regions of nearly identical sequence that are really of no interest.
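
As a rough illustration of the trimming and the two trials, here is a minimal sketch that cuts every FASTA record down to a fixed window and lists both primary/control orientations. The helper names, the toy sequences, and the 0-8 window are all hypothetical. It also assumes the promoter sits at the same offset in every copy, which may not hold if your copies differ by insertions or deletions.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def trim_records(records, start, end):
    """Keep positions start..end (0-based, end exclusive) of each sequence."""
    return [(header, seq[start:end]) for header, seq in records]


def to_fasta(records):
    """Serialize (header, sequence) tuples back to FASTA text."""
    return "".join(f">{header}\n{seq}\n" for header, seq in records)


def trial_pairs(high, low):
    """Return (primary, control) pairs for both orientations."""
    return [(high, low), (low, high)]


# Toy input; real input would be the high- or low-expression FASTA file.
example = ">copy1\nACGTACGTAC\nGGTT\n>copy2\nTTTTCCCCAAAAGG\n"
trimmed = trim_records(read_fasta(example), 0, 8)
print(to_fasta(trimmed), end="")

for primary, control in trial_pairs("high_expr.fa", "low_expr.fa"):
    print("primary:", primary, "control:", control)
```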

Let us know if this isn’t clear, or if you have further questions.
