FIMO preferred input motif format

richard...@gmail.com

Jul 26, 2013, 6:40:02 PM
to meme-...@googlegroups.com

With respect to the FIMO program, I was hoping someone could tell me what the preferred input motif format is. I have noticed that FIMO runs without error if the motif file specifies only a log-odds matrix or only a letter-probability matrix. If I use different formats of the same motif (taken from a MEME output file), the FIMO results are slightly different, even though I use exactly the same background model file that was provided to MEME. Does FIMO convert the letter-probability matrix into a log-odds matrix using a different method from the one used in MEME?

Any comments that help give me insight into what is happening "under the hood" and what is best practice would be appreciated.

Cheers,

Richard

James Johnson

Aug 4, 2013, 8:33:58 PM
The majority of the MEME Suite programs use the same motif parser: that includes MAST, Tomtom, GOMo (or rather AMA), FIMO, MCAST, SpaMo, CentriMo and AME. MEME and DREME only write motifs, and MEME-ChIP just calls the other programs, so in effect it never parses the motifs itself (Correction: MEME-ChIP does parse the XML motifs using a Perl module). The major odd ones out are GLAM and GLAM2, because they have a different motif format, and the Perl script meme2meme, which uses a Perl implementation of the parser that has some limitations.

For historical reasons MAST also behaves slightly differently from the rest. MEME and MAST were the first programs in the MEME Suite, and MAST was written to use the log-odds matrix in the motif. It used to have a wrapper script that was responsible for extracting the log-odds matrix from the MEME motif file into a special format that was then passed to MAST. In version 4.4.0 I changed MAST to use the same motif parser as the rest of the MEME Suite, which meant that the motif parser had to be rewritten to additionally load the log-odds matrix, which it had not been using at all. We could have written it to use just the letter-probability matrix, but at the time we decided not to change the parser to create the log-odds matrix from the letter-probability matrix, because in the case of protein motifs there is not enough information to perfectly recreate the log-odds matrix as MEME would have created it. Later on we changed our minds, so for version 4.8.0 the motif parser was changed to convert the letter-probability matrix into the log-odds matrix, and vice versa, when either is missing. To do this conversion it uses the background listed in the motif file, to get as close as possible to the result that MEME would have output.
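To make the conversion described above concrete, here is a minimal sketch of going between a letter-probability matrix and a log-odds matrix using an order-0 background. This is not the MEME Suite parser's code: the function names, the log base 2, and the optional `pseudocount` smoothing are all assumptions made for illustration.

```python
import math


def probs_to_log_odds(prob_rows, background, pseudocount=0.0):
    """Convert letter probabilities to log2-odds against an order-0 background.

    Hypothetical helper: the base-2 log and the pseudocount blend are
    assumptions for illustration, not the MEME Suite's actual choices.
    """
    log_odds = []
    for row in prob_rows:
        scores = []
        for p, bg in zip(row, background):
            # Blend in the background so a zero probability can stay finite.
            smoothed = (p + pseudocount * bg) / (1.0 + pseudocount)
            scores.append(math.log2(smoothed / bg) if smoothed > 0 else float("-inf"))
        log_odds.append(scores)
    return log_odds


def log_odds_to_probs(log_odds_rows, background):
    """Invert the conversion: p is proportional to background * 2**score."""
    probs = []
    for row in log_odds_rows:
        raw = [bg * 2.0 ** score for score, bg in zip(row, background)]
        total = sum(raw)
        # Renormalise so each motif position sums to 1.
        probs.append([value / total for value in raw])
    return probs


background = [0.25, 0.25, 0.25, 0.25]  # made-up uniform A, C, G, T background
motif = [[0.5, 0.25, 0.125, 0.125]]    # a single made-up motif position
scores = probs_to_log_odds(motif, background)
recovered = log_odds_to_probs(scores, background)
```

In this toy case the same background is used in both directions, so `recovered` matches `motif`. The result of either conversion depends entirely on which background is supplied.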

So, to get back to your question: FIMO uses the letter-probability matrix, but it is quite happy to convert a log-odds matrix into the letter-probability matrix if you don't include one. The results are slightly different because the conversion is not perfect. MEME internally uses a much higher-order background than the order-0 background it outputs, so we simply don't have enough information to do a perfect conversion.
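As a rough illustration of why a background mismatch makes the conversion lossy, the sketch below builds log-odds scores with one background and inverts them with another. It is only an analogy: both backgrounds here are order-0 and the numbers are invented, whereas the real mismatch is between MEME's internal higher-order background and the order-0 one written to the file. The `round_trip` helper is hypothetical.

```python
import math


def round_trip(probs, bg_forward, bg_inverse):
    """Score one motif position with bg_forward, then invert with bg_inverse.

    Hypothetical helper for illustration; assumes every probability is > 0.
    """
    scores = [math.log2(p / b) for p, b in zip(probs, bg_forward)]
    raw = [b * 2.0 ** s for s, b in zip(scores, bg_inverse)]
    total = sum(raw)
    return [value / total for value in raw]


column = [0.7, 0.1, 0.1, 0.1]          # made-up probabilities for A, C, G, T
uniform = [0.25, 0.25, 0.25, 0.25]     # background available when inverting
skewed = [0.3, 0.2, 0.2, 0.3]          # stand-in for the background used upstream

same = round_trip(column, uniform, uniform)    # matching backgrounds
shifted = round_trip(column, skewed, uniform)  # mismatched backgrounds

print("matching backgrounds:  ", [round(v, 4) for v in same])
print("mismatched backgrounds:", [round(v, 4) for v in shifted])
```

With matching backgrounds the original column comes back; with mismatched ones the recovered probabilities drift, which is the kind of small discrepancy that then shows up in downstream scores.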