Background sequences


J

Oct 20, 2023, 5:25:56 AM
to MEME Suite Q&A
Dear all, 
I am trying to find enriched motifs in a set of genes. To this end, I retrieved the cDNA sequences ±3 kb for each gene and compared several MEME Suite strategies:
MEME -> SEA -> TOMTOM
SEA alone
XSTREME

In addition, I followed the instructions from another thread on this site to build the background sequences (https://groups.google.com/g/meme-suite/c/yNascbE8Tig/m/rb27JMuZlwsJ).
I then compared the results of each strategy under three conditions: with a background, without a background, and with a background masked with RepeatMasker.
The results obtained with the RepeatMasker-masked background and without a background were quite similar, but both differed from the results obtained with the unmasked background.
So, from a biological point of view, do you think that masking the background sequences makes sense?
I hope this is clear.
Thanks in advance.

cegrant

Oct 27, 2023, 5:06:46 PM
to MEME Suite Q&A
MEME, STREME, and SEA always use a background. If you don't explicitly provide one, they generate one from the input sequences or the control sequences, shuffling them if needed. The fact that you got different results with the unmasked background is a pretty good indication that you shouldn't be using the unmasked sequences for your background. The problem is that MEME, STREME, and SEA can't distinguish between biologically interesting motifs and the spurious motifs that show up in repetitive or low-complexity regions. For example, if you have sequences that consist entirely of tandem repeats of GC, MEME and STREME are going to think that GCGCGCGCGCGCGCGC is an extremely statistically significant motif, even though it is not any sort of transcription factor binding site.
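
To make the GC-repeat example concrete, here is a minimal Python sketch. It is only an illustration and not how MEME, STREME, or SEA actually score motifs: it assumes a naive zero-order background (independent letters, frequencies taken from the sequence itself) and a plain letter shuffle, and the helper names `count_overlapping` and `expected_count` are made up for this example. It compares how often a short GC-repeat k-mer occurs in a tandem-repeat sequence with how often that simple background would predict.

```python
import random
from collections import Counter


def count_overlapping(seq, kmer):
    """Count occurrences of kmer in seq, allowing overlaps."""
    k = len(kmer)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == kmer)


def expected_count(seq, kmer):
    """Expected occurrences of kmer under a zero-order background.

    Assumption for illustration only: letters are independent and drawn
    with the frequencies observed in seq itself.
    """
    freqs = Counter(seq)
    n = len(seq)
    p = 1.0
    for base in kmer:
        p *= freqs[base] / n
    return p * (n - len(kmer) + 1)


# A 100 bp sequence made entirely of tandem GC repeats.
repeat_seq = "GC" * 50
kmer = "GCGCGCGC"

observed = count_overlapping(repeat_seq, kmer)
expected = expected_count(repeat_seq, kmer)

# Naive letter shuffle (fixed seed for reproducibility). This keeps the
# base composition but destroys the repeat structure.
letters = list(repeat_seq)
random.Random(0).shuffle(letters)
shuffled_seq = "".join(letters)
shuffled_observed = count_overlapping(shuffled_seq, kmer)

print(f"observed in repeat sequence:   {observed}")
print(f"expected under background:     {expected:.3f}")
print(f"observed in shuffled sequence: {shuffled_observed}")
```

The repeat sequence contains the k-mer far more often than the simple background predicts, so any enrichment statistic built on such a comparison will flag it as highly significant, even though nothing about it suggests a binding site. Masking repeats and low-complexity regions before building the background removes this kind of signal.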