FIMO background file

Ireneusz Stolarek

Nov 17, 2016, 6:40:07 AM
to MEME Suite Q&A

I'm running FIMO on FASTA sequences with and without a particular SNP to check whether a binding motif might be lost. The sequences are ~300 bp long. I get vastly different results depending on whether I use a background file calculated across a whole chromosome (as representative of the organism) or one calculated from the FASTA file that I am scanning with FIMO.

So which background file should I use?

CharlesEGrant

Nov 17, 2016, 7:18:16 PM
to meme-...@googlegroups.com
The choice of background model has a huge impact on the performance of FIMO, but selecting a background model is a matter of judgement and compromise. Ideally you want to infer the background from sequences with nucleotide frequencies similar to the sequences you are going to scan, but that contain few or no instances of the motifs you are searching for. 

If you were scanning a full chromosome then deriving the background from the full chromosome would be a good choice. Even the presence of a few thousand copies of a motif wouldn't affect the background much.  
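
To put a rough number on that claim, here is a back-of-the-envelope sketch. The figures (5,000 motif copies, a 12 bp motif, a 50 Mbp chromosome) are made-up illustrative values, not measurements from any real genome, and `motif_fraction` is a hypothetical helper name.

```python
def motif_fraction(n_copies, motif_len, total_len):
    """Fraction of all bases that lie inside motif instances.

    Assumes the instances do not overlap, which is a simplification.
    """
    return (n_copies * motif_len) / total_len


# Hypothetical chromosome-scale example: 5,000 copies of a 12 bp motif
# on a 50 Mbp chromosome cover 60,000 of 50,000,000 bases.
chromosome_share = motif_fraction(5_000, 12, 50_000_000)
print(f"{chromosome_share:.4%} of bases are motif sites")
```

With a share this small, the motif sites can barely move the estimated nucleotide frequencies.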

Providing an appropriate background for a small collection of sequences is harder because of issues like local GC bias. I'd start off by calculating the background from your FASTA file. The only problem might be if instances of your motifs make up a sizable percentage of your sequences. In that case FIMO would lose some of its statistical power to distinguish motif from background.
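
As a minimal sketch of what "calculating the background from your FASTA file" amounts to, the Python below counts single-nucleotide (order-0) frequencies and writes them one per line as `letter frequency`, which I believe matches the simple background layout FIMO reads. The helper names, the pseudocount, and the output precision are my own assumptions for illustration. It also counts only the strand as given, whereas the MEME Suite's own `fasta-get-markov` tool can handle strands and higher orders, so treat this as a way to see the idea rather than a replacement for that tool.

```python
from collections import Counter


def read_fasta(path):
    """Yield (header, sequence) pairs from a FASTA file."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            else:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def zero_order_background(sequences, pseudocount=1.0):
    """Return order-0 frequencies of A, C, G, T over the given sequences.

    Ambiguous symbols such as N are ignored. The pseudocount is an
    assumption added here so that no frequency is exactly zero.
    """
    counts = Counter()
    for seq in sequences:
        for base in seq.upper():
            if base in "ACGT":
                counts[base] += 1
    total = sum(counts[b] for b in "ACGT") + 4 * pseudocount
    return {b: (counts[b] + pseudocount) / total for b in "ACGT"}


def format_background(freqs):
    """Format frequencies as 'letter frequency' lines with a comment header."""
    lines = ["# order 0"]
    lines += [f"{b} {freqs[b]:.6f}" for b in "ACGT"]
    return "\n".join(lines) + "\n"


# Tiny in-memory example; real input would come from read_fasta(path).
example = zero_order_background(["ACGT", "AAGG"], pseudocount=0)
print(format_background(example))
```

The resulting file can then be passed to FIMO through its `--bgfile` option. Check the documentation for your installed MEME Suite version for the exact format and defaults.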