How to extract motif list for each sequence from MEME or MAST outputs?

ahmedel...@gmail.com

Aug 6, 2016, 12:44:00 AM

to MEME Suite Q&A

Hey,

I generated MEME outputs (txt, html, and xml) for a large number of protein sequences, and I want to extract the list of motifs for each protein, e.g. [seq1 = 4,6,16,14,20,13,10,...], and so on.

How can I do that?

Thanks

Best regards,

Ahmed

CharlesEGrant

Aug 8, 2016, 3:22:49 PM

to MEME Suite Q&A

Hi Ahmed,

MEME is primarily a motif discovery tool, not a motif scanning tool. MEME does report the results of a scan with the discovered motifs, but it does this using a fixed threshold of 0.0001 for the match p-value.

The HTML and XML output for MEME both contain the results of the scan. In the HTML output it's contained in a collection of JavaScript object declarations at the top of the file, and won't be easy for you to extract. The MEME XML output will probably be easier to work with. Look for the "scanned-sites" and "scanned-site" tags. You'll have to write your own script to extract this information and put it in the format you've asked for. Most popular scripting languages have libraries for reading and parsing XML files.
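As a rough illustration of such a script, here is a minimal Python sketch using only the standard library. It is not part of MEME; the function names `motifs_per_sequence` and `motif_number` are made up for this example. It assumes the scan results sit in elements named `scanned_sites` / `scanned_site` (either underscore or hyphen spelling is accepted), that each carries `sequence_id`, `motif_id` and `position` attributes, and that `sequence` elements map the internal `id` to the original `name`. Check these against your own XML file before relying on it.

```python
import re
import xml.etree.ElementTree as ET


def motif_number(motif_id):
    """Turn an id such as 'motif_4' into 4; return the id unchanged otherwise."""
    m = re.search(r"(\d+)$", motif_id or "")
    return int(m.group(1)) if m else motif_id


def motifs_per_sequence(xml_source):
    """Return {sequence name: [motif numbers ordered by site position]}.

    xml_source may be a file name or an open (binary) file object.
    Tag and attribute names are assumptions about the MEME XML layout.
    """
    root = ET.parse(xml_source).getroot()

    # Map internal ids (e.g. "sequence_0") back to the names from the input.
    names = {s.get("id"): s.get("name", s.get("id"))
             for s in root.iter("sequence")}

    result = {}
    for elem in root.iter():
        # Accept both "scanned_sites" and "scanned-sites".
        if elem.tag.replace("-", "_") != "scanned_sites":
            continue
        sites = [c for c in elem
                 if c.tag.replace("-", "_") == "scanned_site"]
        # Order the motif hits along the sequence.
        sites.sort(key=lambda s: int(s.get("position", 0)))
        seq_id = elem.get("sequence_id")
        result[names.get(seq_id, seq_id)] = [
            motif_number(s.get("motif_id")) for s in sites
        ]
    return result
```

Calling `motifs_per_sequence("meme.xml")` and printing each entry as `name = 4,6,16,...` would give something close to the format asked for. Sequences with no site passing MEME's fixed threshold simply get an empty list, or do not appear at all if the file has no entry for them.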

If you want more control over the scan for motif matches, you may want to use FIMO. Run FIMO on your sequence database, using one of your MEME output files as the motif input. FIMO generates output in tab-delimited plain text, XML, and HTML formats. Again, you'll have to write your own script to put the information in the format you want. The tab-delimited plain text or XML will be relatively easy to work with.
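For the tab-delimited FIMO output, a script along the following lines might be a starting point. This is only a sketch under assumptions: the function name `fimo_motifs_per_sequence` and the `max_pvalue` parameter are invented for this example, and the column names are guesses that cover two header styles (a motif column called either `motif_id` or `#pattern name`, plus `sequence name`/`sequence_name`, `start` and `p-value`). The header differs between FIMO versions, so compare it with the first line of your own file.

```python
import csv
from collections import defaultdict


def fimo_motifs_per_sequence(lines, max_pvalue=1e-4):
    """Return {sequence name: [motif ids ordered by start position]}.

    `lines` is any iterable of text lines, e.g. an open file.
    Column names are assumptions; adjust them to your FIMO version.
    """
    rows = (ln for ln in lines if ln.strip())  # drop blank lines
    reader = csv.reader(rows, delimiter="\t")

    # Normalise the header: "#pattern name" -> "pattern_name", etc.
    header = [h.lstrip("#").strip().lower().replace(" ", "_")
              for h in next(reader)]
    motif_col = (header.index("motif_id") if "motif_id" in header
                 else header.index("pattern_name"))
    seq_col = header.index("sequence_name")
    start_col = header.index("start")
    p_col = header.index("p-value")

    hits = defaultdict(list)
    for row in reader:
        # Skip trailing comment lines and malformed rows.
        if row[0].startswith("#") or len(row) < len(header):
            continue
        if float(row[p_col]) > max_pvalue:
            continue
        hits[row[seq_col]].append((int(row[start_col]), row[motif_col]))

    # Sort each sequence's hits by start position and keep the motif ids.
    return {seq: [motif for _, motif in sorted(sites)]
            for seq, sites in hits.items()}
```

With an open file, usage would look like `with open("fimo.tsv") as fh: table = fimo_motifs_per_sequence(fh)`, where the file name is whatever your FIMO run produced. Filtering on the p-value inside the script is one design choice; the threshold can instead be set when running FIMO.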
