How does the ZOOPS model decide which dataset sequences contain a motif?


Kelian Hacid

Aug 11, 2015, 9:42:12 AM
to MEME Suite Q&A



Hi,

I ran MEME on a dataset of 24 sequences using the ZOOPS model. The following image shows the motif location output:




As you can see, only 7 of the 24 dataset sequences were considered. But when I change the motif location option from "motif sites only" to "all sequences", I get this:




Now all the dataset sequences are shown, and we can find other sites where the motif occurs. But there are also dataset sequences containing perfectly matching sites that were not chosen by the ZOOPS model. Since I built these sequences myself, I know that some of the chosen sequences are identical to others that were not chosen.


How does the ZOOPS model decide which dataset sequences contain a motif? I know it involves the model's gamma parameter, which is the prior probability that a sequence contains a motif occurrence. But I cannot find out how this parameter is set. Is it a random guess made at the beginning of the MEME run?


Best regards,

Kélian

CharlesEGrant

Aug 11, 2015, 6:53:02 PM
to meme-...@googlegroups.com
Hi Kélian,

MEME doesn't perform an exhaustive search for possible motifs and motif sites. It uses heuristics to make initial guesses about possible motifs and motif sites, and then tries to improve those guesses using EM. A greedy algorithm is used: as soon as MEME has sufficient evidence for a motif, it reports it, and starts looking for a different motif. It repeats this until it's found the number of motifs requested, or it's run out of time.
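To make the EM step concrete for the ZOOPS case, here is a minimal Python sketch of one E-step and the matching update of gamma. It follows the published description of the ZOOPS model, not MEME's actual source code, and it leaves out the initial-guess heuristics and the greedy search entirely. The function names, the toy motif matrix, and the uniform background are assumptions made purely for illustration.

```python
def zoops_e_step(seqs, pwm, background, gamma):
    """One E-step of a ZOOPS-style model (illustrative sketch only).

    pwm[k][c] is the assumed probability of letter c at motif column k.
    Returns, per sequence, (posterior per start position, posterior that
    the sequence contains a site at all).
    """
    width = len(pwm)
    results = []
    for seq in seqs:
        m = len(seq) - width + 1      # number of possible start positions
        lam = gamma / m               # prior that a given position starts a site
        ratios = []
        for j in range(m):
            # Likelihood ratio: motif model vs. background for the window at j.
            r = 1.0
            for k in range(width):
                c = seq[j + k]
                r *= pwm[k][c] / background[c]
            ratios.append(r)
        # (1 - gamma) is the weight of the "no site in this sequence" case.
        denom = (1.0 - gamma) + lam * sum(ratios)
        z = [lam * r / denom for r in ratios]
        results.append((z, sum(z)))
    return results


def zoops_update_gamma(results):
    """M-step for gamma: average posterior that a sequence contains a site."""
    return sum(q for _, q in results) / len(results)


# Toy data (hypothetical): a 3-column motif with consensus ACG.
letters = "ACGT"
background = {c: 0.25 for c in letters}
pwm = [{c: (0.85 if c == best else 0.05) for c in letters} for best in "ACG"]
seqs = ["TTACGTT", "TTACGTT", "TTTTTTT"]

results = zoops_e_step(seqs, pwm, background, gamma=0.5)
for seq, (_, q) in zip(seqs, results):
    print(seq, round(q, 3))
print("updated gamma:", round(zoops_update_gamma(results), 3))
```

In this sketch gamma starts from an initial value and is then re-estimated from the data at each iteration. Two identical sequences always receive identical posteriors here, so any difference in which of them is reported has to arise elsewhere in the search procedure.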

You can modify the number of candidate sites MEME has to identify before reporting a motif using the '-minsites' and '-maxsites' options. For the ZOOPS model, '-minsites' would default to 2, but you could increase it, and as you increase it, more sequences should show up in the "motif sites only" option.

Because of the heuristics and the greedy algorithm, MEME is not guaranteed to find motifs in strictly decreasing order of significance. For that reason, it's recommended you initially set the number of motifs to be discovered to more than one. If you'd set '-nmotifs 10', MEME would probably have identified at least a couple of clearly similar motifs, which would have been the signal to increase '-minsites' or to switch to the OOPS model.
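As a rough illustration of combining the options mentioned above, the following Python sketch assembles a MEME command line with '-mod', '-nmotifs', '-minsites', and '-maxsites'. The helper name `build_meme_command`, the file name `sequences.fasta`, and the site counts are hypothetical; pick values that fit your own dataset.

```python
import shlex


def build_meme_command(fasta_path, nmotifs=10, minsites=None,
                       maxsites=None, model="zoops"):
    """Build an argument list for the `meme` command-line tool.

    Hypothetical helper: it only assembles the arguments and does not
    run MEME or check that it is installed.
    """
    cmd = ["meme", fasta_path, "-mod", model, "-nmotifs", str(nmotifs)]
    if minsites is not None:
        cmd += ["-minsites", str(minsites)]
    if maxsites is not None:
        cmd += ["-maxsites", str(maxsites)]
    return cmd


# Example values only: ask for 10 motifs and require at least 12 sites each.
cmd = build_meme_command("sequences.fasta", nmotifs=10,
                         minsites=12, maxsites=24)
print(shlex.join(cmd))
```

With MEME installed, the resulting list could be passed to `subprocess.run(cmd, check=True)`. Raising the `minsites` value and rerunning is one way to see whether more sequences show up among the reported motif sites.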

Charles

Kelian Hacid

Aug 12, 2015, 8:12:02 AM
to MEME Suite Q&A
OK, thank you Charles!

