FIMO output files generated by analysis with MEME-ChIP

Victoria

Jun 17, 2020, 8:12:25 AM

to MEME Suite Q&A

Hi,

I performed motif discovery on a rather large set of defined open chromatin regions with the command-line version of MEME-ChIP, using the parameters -ccut 0 and -spamo-skip and otherwise default settings.

I am interested in where in my input sequences the significant motifs are located. As I understand it, this is provided in the FIMO output files that MEME-ChIP generates for each significant motif.

However, I wonder why the FIMO output generated by MEME-ChIP gives a very different number of motif matches than the FIMO web application. As input to the FIMO web application I use the same sequences that were used as input to MEME-ChIP and a selection of the significant motifs that MEME-ChIP identified. The FIMO web application gives a much higher number of matches for many of the motifs. I use the default p-value threshold.

Does this have to do with the background used by MEME-ChIP when scanning the sequences for individual matches with FIMO?

My set of open chromatin regions is larger than 1 Mb but the FIMO web application accepts the file. Could the size of the input file be the problem?

All in all, how do I best identify putative motif sites for the identified enriched motifs in my set of open chromatin regions to use in further analyses?

Victoria

cegrant

Jun 29, 2020, 6:34:06 PM

to MEME Suite Q&A

By default FIMO will use a background derived from the NR database of nucleotide sequences. MEME-ChIP will use the observed nucleotide frequencies in the input sequences for the background model. This could certainly result in substantial differences.

It would be helpful if you could post copies of your MEME-ChIP and FIMO results. That would help us sort out any issues. When you are editing a posting you can attach a file by clicking on the paper clip icon next to the "Post message" button.

Victoria

Jun 30, 2020, 7:29:14 AM

to MEME Suite Q&A

Hi,

I do not think there are any issues with the programs; rather, as you suggest, the different backgrounds used by MEME-ChIP and FIMO cause the different results. I ran the command-line version of FIMO with the same parameters and background used by MEME-ChIP and got the same results as MEME-ChIP. I also ran the command-line version of FIMO with the default background and the same parameters as in the FIMO web application, and got the same results as the web application.

What is the most appropriate background? I am analysing a large sequence set defining open chromatin sites in a human tissue. Are the observed nucleotide frequencies in the input sequences preferred over those of the NR database of nucleotide sequences?

I also notice that MEME-ChIP does not find any motif sites for any of the 6 bp motifs. Is this because such short motifs have low information content, so that no match reaches a p-value low enough to pass the default threshold?
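The arithmetic behind this question can be sketched roughly as follows. This is only an illustration under stated assumptions: a uniform background (0.25 per base), a perfectly conserved motif, and a p-value threshold of 1e-4, which as far as I know is FIMO's default. The consensus strings and the helper `min_match_pvalue` are made up for the example. The best possible match to a motif can never be rarer than one specific sequence of the same width, so its p-value is at least the background probability of that sequence.

```python
# Sketch: lowest p-value a perfect match can reach for a motif of a given width.
# Assumptions (not from the thread): uniform background, perfectly conserved
# motif, threshold 1e-4. Consensus strings below are hypothetical examples.

def min_match_pvalue(consensus, background):
    """Background probability of the single best-scoring sequence."""
    p = 1.0
    for base in consensus:
        p *= background[base]
    return p

uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
THRESHOLD = 1e-4  # assumed default p-value threshold

p6 = min_match_pvalue("CACGTG", uniform)    # 6 bp motif
p8 = min_match_pvalue("CACGTGAC", uniform)  # 8 bp motif

print(f"6 bp: {p6:.3g}, passes threshold: {p6 < THRESHOLD}")
print(f"8 bp: {p8:.3g}, passes threshold: {p8 < THRESHOLD}")
```

Under these assumptions a 6 bp motif bottoms out at 0.25^6, about 2.4e-4, which is above 1e-4, while an 8 bp motif can get below it. With a skewed background the floor shifts, so this is a rough guide and not an exact account of what FIMO computes.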

Victoria

cegrant

Jul 1, 2020, 11:28:38 PM

to MEME Suite Q&A

The ideal background model would be derived from a collection of sequences that are biologically similar to the ones you wish to analyze, but that don't contain any instances of the motifs you are scanning for. In practice this is hard to come by, and the usual expedient is to just use the nucleotide frequencies from the input sequences. This is what MEME-ChIP does by default, and it is generally pretty good unless instances of the motif make up more than a couple of percent of your input sequences.
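For illustration, "nucleotide frequencies from the input sequences" amounts to a simple count over the FASTA file. The sketch below is a minimal stand-alone version, not MEME-ChIP's own code: the function names and the toy sequences are hypothetical, and the text layout written by `format_background` (a comment line followed by one "letter frequency" pair per line) is my assumption of a zero-order background file, so check it against the MEME Suite documentation before use. In practice the `fasta-get-markov` utility shipped with the MEME Suite should be the safer way to produce such a file.

```python
from collections import Counter

def background_frequencies(fasta_text):
    """Zero-order A/C/G/T frequencies over all sequences in FASTA text."""
    counts = Counter()
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line or line.startswith(">"):
            continue  # skip blank lines and FASTA headers
        # Count only unambiguous bases; N and other codes are ignored.
        counts.update(b for b in line.upper() if b in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no A/C/G/T characters found")
    return {b: counts[b] / total for b in "ACGT"}

def format_background(freqs):
    """Render frequencies as text (assumed layout, see note above)."""
    lines = ["# order 0"]
    lines += [f"{b} {freqs[b]:.4f}" for b in "ACGT"]
    return "\n".join(lines) + "\n"

# Toy input standing in for a real set of open chromatin regions.
fasta_text = ">region1\nAAAC\n>region2\nGTAA\n"
freqs = background_frequencies(fasta_text)
background_text = format_background(freqs)
print(background_text)
```

This version counts one strand only and does not average a sequence with its reverse complement, which may differ from what the MEME Suite tools do for DNA.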

This posting may be helpful.
