How do I make my own log-odds matrix for MAST?


CharlesEGrant

Sep 28, 2012, 6:57:29 PM
to meme-...@googlegroups.com
Originally posted by bedutilh 02 Oct 2006

How do I convert an aligned profile into a log-odds matrix of the kind I need to run MAST?

e.g. starting from a JASPAR profile: http://mordor.cgb.ki.se/cgi-bin/jaspar2005/jaspar_db.pl?ID=MA0060&rm=present&db=CORE 

Does it involve a pseudocount or something? How else do we handle the "expected" frequencies at unmutated sites like 6 and 9 in the MA0060 example? http://en.wikipedia.org/wiki/Log-odds 
A simple +1 pseudocount does not give the right result if I try it on a motif from a MEME output.

CharlesEGrant

Sep 28, 2012, 6:59:32 PM
to meme-...@googlegroups.com
Originally posted by tbailey 07 Oct 2006

You can easily create a file containing one or more motifs in the proper format for MAST or Meta-MEME using the "transfac2meme" script that is distributed with MEME. This script converts TRANSFAC-style "count" matrices into MEME-style motifs. To use it, create a file containing your count matrices in TRANSFAC-style format:

Quote:
ID Motif_name 
PO A C G T 
01 6 16 18 13 - 
02 15 18 18 2 - 
03 13 12 18 10 - 
04 0 4 49 0 - 
05 52 0 0 1 - 
06 0 1 0 52 - 
07 27 6 14 6 - 
08 7 13 23 10 - 
09 5 20 15 13 - 
10 7 13 19 14 - 
// 


You should replace "Motif_name" with a name of your choice (no spaces or tabs). You may have more than one count matrix entry in your input file, but each must be given a different "Motif_name". Each motif in your TRANSFAC-style file should start with an "ID" line and end with a "//" line. The first line after the "ID" line is a header line giving the letter that each column represents. The numbered lines that follow give the letter counts for each position in the motif.
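If you want to sanity-check your input file before running the script, here is a minimal Python sketch that reads the layout described above. It is not part of MEME and is not how transfac2meme parses its input; the function name parse_transfac_counts is made up for illustration, and it assumes whitespace-separated fields with an optional trailing "-" column as in the example.

```python
def parse_transfac_counts(text):
    """Parse TRANSFAC-style count matrices into {name: (letters, rows)}.

    Hypothetical helper for illustration only; assumes the layout shown
    in the post: an "ID" line, a "PO" header line, numbered count lines,
    and a closing "//" line.
    """
    motifs = {}
    name, letters, rows = None, None, []
    for raw in text.splitlines():
        fields = raw.split()
        if not fields:
            continue  # skip blank lines
        tag = fields[0]
        if tag == "ID":
            name, letters, rows = fields[1], None, []
        elif tag == "PO":
            letters = fields[1:]  # e.g. ["A", "C", "G", "T"]
        elif tag == "//":
            if name is not None:
                motifs[name] = (letters, rows)
            name, letters, rows = None, None, []
        elif tag.isdigit() and letters is not None:
            # Keep one count per letter; the trailing "-" column is ignored.
            rows.append([float(x) for x in fields[1:1 + len(letters)]])
    return motifs


EXAMPLE = """\
ID Motif_name
PO A C G T
01 6 16 18 13 -
02 15 18 18 2 -
03 13 12 18 10 -
04 0 4 49 0 -
05 52 0 0 1 -
06 0 1 0 52 -
07 27 6 14 6 -
08 7 13 23 10 -
09 5 20 15 13 -
10 7 13 19 14 -
//
"""

motifs = parse_transfac_counts(EXAMPLE)
letters, rows = motifs["Motif_name"]
print(letters, len(rows), rows[3])
```

A quick check like this catches a missing "//" line or a duplicated motif name (a later entry would silently overwrite the earlier one in the dictionary) before the file goes to transfac2meme.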

To create the MEME version 3 motif file, do: 


Code:
transfac2meme motif.dat > motif.meme


where "motif.dat" is the name of a file containing TRANSFAC-style count matrices. Your MEME-style motifs will be in the file "motif.meme".

To see the other options allowed, type:
Code:
transfac2meme

For example, you can change the total pseudo-counts added to the columns of your count matrices when they are converted to log-odds matrices. The following command will create a motif file using a total of 0.5 pseudo-counts:
Code:
transfac2meme -pseudo 0.5 motif.dat > motif.meme
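To see why pseudo-counts matter for positions with zero counts, here is a rough Python sketch of one common way to turn a count column into log-odds scores. This is an assumption for illustration, not necessarily the exact formula or scaling that transfac2meme uses: the total pseudo-count is split among the letters in proportion to the background frequencies, and the score is the base-2 log of the adjusted frequency over the background. The function name column_log_odds is hypothetical.

```python
import math


def column_log_odds(counts, background, pseudo=0.5):
    """Convert one column of letter counts to log-odds scores.

    Assumed scheme (not confirmed as the script's own): the total
    pseudo-count `pseudo` is distributed in proportion to `background`,
    then score = log2(adjusted_frequency / background_frequency).
    """
    total = sum(counts)
    scores = []
    for count, bg in zip(counts, background):
        freq = (count + pseudo * bg) / (total + pseudo)
        scores.append(math.log2(freq / bg))
    return scores


uniform = [0.25, 0.25, 0.25, 0.25]  # default uniform background (A, C, G, T)
column = [0, 4, 49, 0]              # position 04 of the example matrix
scores = column_log_odds(column, uniform, pseudo=0.5)
print([round(s, 2) for s in scores])
```

Without the pseudo-count, the two zero counts would give log2(0), which is undefined. With it, those letters get a large negative but finite score, and a smaller total pseudo-count makes that penalty more severe.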


Usually, you will want to specify a different background model than the default uniform model. For example, if you intend to search an organism with GC-content of 0.6, you should use this command: 
Code:
transfac2meme -bg bfile motif.dat > motif.meme

where bfile is a file containing: 
Quote:
A 0.2 
C 0.3 
G 0.3 
T 0.2 
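If you need background files for several GC-contents, the numbers above are easy to generate. The Python sketch below is only an illustration: the function names are made up, and it assumes A and T share the non-GC fraction equally and C and G share the GC fraction equally, which is how the 0.6 example works out. The "letter frequency" line layout is copied from the file shown above.

```python
def background_from_gc(gc_content):
    """Return A/C/G/T frequencies for a given GC-content.

    Assumes A = T and C = G (hypothetical helper, not part of MEME).
    """
    if not 0.0 < gc_content < 1.0:
        raise ValueError("gc_content must be strictly between 0 and 1")
    at = (1.0 - gc_content) / 2.0
    gc = gc_content / 2.0
    return {"A": at, "C": gc, "G": gc, "T": at}


def format_bfile(freqs):
    """Format frequencies one letter per line, as in the example bfile."""
    return "\n".join(f"{letter} {freqs[letter]:.3f}" for letter in "ACGT") + "\n"


bfile_text = format_bfile(background_from_gc(0.6))
print(bfile_text, end="")
# To save it under the name used above: open("bfile", "w").write(bfile_text)
```

For a real search it is usually better to estimate the letter frequencies from the sequences you intend to scan than to assume this symmetric split.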