How does MEME react when the input motif size and/or the input background changes?

Kelian Hacid

Jul 14, 2015, 1:45:18 PM

to meme-...@googlegroups.com

Hi,

I am running experiments to understand how MEME reacts to changes in its input. These experiments seemed worthwhile when I saw that, for the same sequence A, changing the number of repetitions of a single motif from 3 to 4 completely changed the result: with 4 repetitions the motif is found, while with 3 repetitions it is not.

So I started creating my own artificial sequences to find out how MEME's efficiency depends on the parameters of the input file (M: the size of the motif model; B: the size of the background model).

I worked with a single amino acid sequence using the anr distribution model. To evaluate the results, I recorded the E-value (e-v) and the percentage (%) of the motif found that matches the real motif.

In my artificial sequence the motif appeared twice. NB: if M=24, the real motif size is 12.
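A minimal sketch of how such an artificial sequence could be built, assuming that M counts the residues inside the motif copies and B counts the residues outside them (the meaning of B given later in this thread). The motif string, the function names, and the uniform random noise are illustrative assumptions, not the sequences actually used in these experiments.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def make_sequence(motif, copies, noise_len, seed=0):
    """Plant `copies` copies of `motif`, each surrounded by uniform random residues."""
    rng = random.Random(seed)  # fixed seed so the sequence is reproducible

    def noise():
        return "".join(rng.choice(AMINO_ACIDS) for _ in range(noise_len))

    parts = []
    for _ in range(copies):
        parts.append(noise())
        parts.append(motif)
    parts.append(noise())  # trailing noise after the last copy
    return "".join(parts)


def to_fasta(name, seq, width=60):
    """Format one sequence as a FASTA record with wrapped lines."""
    lines = [">" + name]
    lines += [seq[i:i + width] for i in range(0, len(seq), width)]
    return "\n".join(lines) + "\n"


motif = "WCHPGKTYDNEQ"  # hypothetical 12-residue motif
seq = make_sequence(motif, copies=2, noise_len=50)
fasta = to_fasta("synthetic_1", seq)

M = 2 * len(motif)   # residues inside motif copies (24 here)
B = len(seq) - M     # residues outside motif copies (150 here)
```

Varying `copies` and `noise_len` while keeping the seed fixed gives a controlled way to change M and B independently.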

The results were:

For M<=24, MEME found the motif (%>=60%) as long as B is under a limit. Above this limit the motif is not found.

For M>24 the results became more confusing. The motif is found only for very low background levels ...

After that I studied the Expectation Maximization process but I couldn't figure out how to explain those results.

Given the complexity of the algorithm and the interdependence of symbols in amino acid sequences, I understand that my experiments were not exhaustive. Yet are there criteria the input structure (for M and B) should respect to maximize MEME efficiency?

Best regards,

Kelian

cegrant

Jul 14, 2015, 3:43:04 PM

to meme-...@googlegroups.com

Hi Kelian,

It's not surprising that MEME gets different results depending on how many copies of the motif are in the test file. MEME is performing a statistical analysis of the evidence for a motif, and you are changing the amount of evidence available! There are many parameters described in the documentation for MEME for tuning the analysis. For example the '-minsites' parameter sets the minimum number of possible motif sites that must be found for a candidate motif to be considered.
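As a rough sketch of how such parameters could be set for a test like the one described, the snippet below only assembles a MEME command line and prints it. The file name, output directory, and chosen values are hypothetical, and the option names (`-protein`, `-oc`, `-mod`, `-nmotifs`, `-minw`, `-maxw`, `-minsites`) should be checked against the documentation of the installed MEME version.

```python
import shlex


def meme_command(fasta_path, out_dir, min_width, max_width, min_sites):
    """Build (but do not run) a MEME command line as an argument list."""
    return [
        "meme", fasta_path,
        "-protein",                    # treat the input as amino acid sequences
        "-oc", out_dir,                # output directory, overwritten if present
        "-mod", "anr",                 # any number of repetitions per sequence
        "-nmotifs", "1",               # look for a single motif
        "-minw", str(min_width),       # minimum motif width
        "-maxw", str(max_width),       # maximum motif width
        "-minsites", str(min_sites),   # minimum number of sites for a motif
    ]


# Hypothetical file name, output directory, and values
cmd = meme_command("synthetic.fasta", "meme_out", 6, 12, 2)
print(shlex.join(cmd))
```

The argument list can be passed to `subprocess.run(cmd, check=True)` once MEME is installed and on the `PATH`.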

I'm not sure what you mean by B, the size of the background model. All background models for a given alphabet will have the same size. Do you mean the order of the model?

Yet are there criteria the input structure (for M and B) should respect to maximize MEME efficiency?

The interaction between the background model, the motif model, and MEME's detection efficiency is very complex! It can't be well represented by just considering the width of the motif and the order of the background model. The one exception is that MEME is generally poor at finding short motifs (less than 7 residues or so). I don't know what your background is, but to tackle this problem you may need to spend some time reading on position weight matrices, log-likelihood scores, and information content. Those topics are beyond the scope of what we can help with, and are better taken up with your advisor or instructor.

Kelian Hacid

Jul 15, 2015, 1:41:23 PM

to meme-...@googlegroups.com

Hi,

Thank you for your reply.

I have just realized that I did not explain well why I ran those experiments. I agree that adding repetitions of a motif clearly and logically affects the results. (I also made a mistake in calling B "the size of the background model". B was what could be called the noise in the sequence, that is, all the symbols located outside the motif repetitions.)

What was more interesting was that with 3 repetitions MEME did not find the motif, but when I added 10 symbols ("AKLDMSEEC", for instance) at the end of each repetition, MEME found it. That is why I assumed that the size of the motif or the size of B affects MEME's efficiency.

My tests were meant to understand how and why MEME reacts this way. Unfortunately, I could not figure it out from either the documentation or my experiments (the results were too unstable).

I wonder whether you have any thoughts on my assumption.

Best regards,

Kélian

cegrant

Jul 15, 2015, 2:56:03 PM

to meme-...@googlegroups.com

Hi Kelian,

The "background model" in MEME has a very specific meaning, and I don't think you are referring to the same thing. In MEME the background model is a Markov model describing the probability distribution of random nucleotides or amino acids in your sequence data. Please see the documentation for MEME background models.
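To make the distinction concrete, the simplest case is an order-0 Markov model, which is just a table of residue probabilities. The sketch below estimates such a table from a set of sequences. The function name `order0_background` and the pseudocount are assumptions made for illustration, and this is not MEME's own code or its background file format.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def order0_background(sequences, pseudocount=1.0):
    """Estimate order-0 residue probabilities, smoothed with a pseudocount."""
    counts = Counter()
    for seq in sequences:
        # Count only standard residues; skip ambiguity codes such as X
        counts.update(ch for ch in seq.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
    return {aa: (counts[aa] + pseudocount) / total for aa in AMINO_ACIDS}


background = order0_background(["MKLVAAKLDMSEEC", "AKLDMSEECGHT"])
```

Whatever the input length, the table always has one entry per letter of the alphabet, which is why all order-0 models for a given alphabet have the same size. Higher-order models condition each residue on the preceding ones.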

As I said before, simply looking at the size of the motif is a huge oversimplification. At the very least you'll need to consider the information content of the motifs.
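As a hedged illustration of what information content measures, the sketch below sums, over the columns of a set of aligned sites, the relative entropy in bits between the observed residue frequencies and a background distribution. The function name, the uniform default background, and the example sites are assumptions for illustration; this is not how MEME itself reports the quantity.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def information_content(sites, background=None):
    """Total relative entropy (bits) of aligned, equal-length sites vs. a background."""
    if background is None:
        # Assumption: uniform background over the 20 residues
        background = {aa: 1.0 / len(AMINO_ACIDS) for aa in AMINO_ACIDS}
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all sites must have the same width")
    total = 0.0
    for col in range(width):
        counts = Counter(site[col] for site in sites)
        for aa, count in counts.items():
            p = count / len(sites)  # observed frequency in this column
            total += p * math.log2(p / background[aa])
    return total


conserved = information_content(["ACD", "ACD", "ACD"])  # every column identical
variable = information_content(["AC", "AD"])            # second column varies
```

Two motifs of the same width can differ widely in this score: a fully conserved column contributes log2(20) bits under a uniform background, while a column that matches the background contributes nothing. This is one reason width alone says little about how detectable a motif is.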

We are happy to help with specific questions about running the tools in the MEME Suite, but helping you design your experiments is beyond the scope of help we can offer. This is something you should take up with your advisor or instructors.

Kelian Hacid

Jul 16, 2015, 8:12:23 AM

to meme-...@googlegroups.com

OK, thank you Charles!
