Motif finding


Fathima Ashraf

Aug 4, 2024, 11:57:18 AM
to MEME Suite Q&A
I am currently working on a project to identify transcription factor binding sites (TFBSs) in the promoter regions of genes from 40 hierarchical clusters and I have encountered a challenge that I would like to know if STREME might be able to address.

How can I identify novel motifs in the promoter regions of genes from 40 hierarchical clusters, defining the promoter as the region 1000 bp upstream of the transcription start site (TSS)? To avoid confusion with TFBSs from neighboring genes that are unrelated to the genes I am studying, should the analysis be restricted to genes where the nearest upstream neighboring gene transcribing on the opposite strand is more than 1000 bp away? Or can STREME address this issue by ensuring that the identified novel motifs are specific to the defined promoter regions and do not include overlapping motifs from neighboring genes?

I would also like to know more about the site percentage or coverage that is a part of the results section when running STREME.

Thank you
Fathima Ashraf

cegrant

Aug 13, 2024, 7:56:49 PM
to MEME Suite Q&A
At their root, STREME and MEME identify short, similar subsequences that are statistically overrepresented in the input sequences. Neither STREME nor MEME incorporates any information about distances from neighboring genes. It’s up to you to select the sequences that you think are likely to contain instances of the motifs. You can also provide sequences that you expect >not< to contain instances of a motif as a negative control set.
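
As a rough illustration of supplying a negative control set, here is a minimal Python sketch that assembles a STREME command line with primary sequences (`--p`) and control sequences (`--n`). The file names and the helper `streme_command` are hypothetical placeholders, and the options are worth checking against `streme --help` for your installed version. If no control file is given, STREME should generate its own control set by shuffling the primary sequences.

```python
import shlex


def streme_command(primary_fasta, control_fasta=None, out_dir="streme_out"):
    """Build (but do not run) a STREME command line as a list of arguments.

    primary_fasta: FASTA file of promoter sequences for one cluster.
    control_fasta: optional FASTA file of sequences NOT expected to
                   contain the motif (negative control set).
    """
    cmd = ["streme", "--p", primary_fasta, "--dna", "--oc", out_dir]
    if control_fasta is not None:
        cmd += ["--n", control_fasta]
    return cmd


# Hypothetical file names, for illustration only.
cmd = streme_command(
    "cluster01_promoters.fa",
    control_fasta="background_promoters.fa",
    out_dir="streme_cluster01",
)
print(shlex.join(cmd))
```

With 40 clusters, one option is to call this once per cluster, for example using the promoters of the other clusters as the control set, and then pass each command to `subprocess.run`.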

Your best bet would be to select only those upstream regions that are within 1000 bp of a single gene. Of course, that may not be possible. Also, biology being biology, picking a limit of 1000 bp is a rule of thumb, not a hard and fast rule. Furthermore, who’s to say that a motif might not function as a promoter for multiple genes with overlapping upstream regions? You may need to perform further analysis after STREME to establish which motifs are promoters for which genes. For example, you may be able to use Tomtom to see if the discovered motifs match known motifs and then look up what is known about the function of those motifs. You might also consider using XSTREME, which combines STREME with several other types of motif analysis.
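
To make the "within 1000 bp of a single gene" selection concrete, here is a minimal Python sketch under stated assumptions. The `Gene` record, the coordinates, and the function names are all hypothetical. It uses one possible definition of a clash: a gene is dropped if its upstream window overlaps another gene's body or another gene's upstream window on the same chromosome. A real analysis would read gene coordinates from an annotation file (GFF/GTF) instead of a hand-written list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    chrom: str
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    strand: str  # "+" or "-"


def upstream_window(gene, length=1000):
    """Return (start, end) of the `length` bp upstream of the TSS."""
    if gene.strand == "+":
        # TSS is gene.start; upstream means lower coordinates.
        return max(1, gene.start - length), gene.start - 1
    # TSS is gene.end; upstream means higher coordinates.
    return gene.end + 1, gene.end + length


def overlaps(a, b):
    """True if two closed intervals (start, end) share at least one base."""
    return a[0] <= b[1] and b[0] <= a[1]


def genes_with_unshared_promoter(genes, length=1000):
    """Keep genes whose upstream window touches no other gene or window."""
    kept = []
    for g in genes:
        window = upstream_window(g, length)
        if window[1] < window[0]:
            continue  # no upstream sequence available (gene at position 1)
        clash = any(
            other.name != g.name
            and other.chrom == g.chrom
            and (
                overlaps(window, (other.start, other.end))
                or overlaps(window, upstream_window(other, length))
            )
            for other in genes
        )
        if not clash:
            kept.append(g)
    return kept


# Made-up coordinates: A and B are divergent neighbors, C and D are isolated.
genes = [
    Gene("A", "chr1", 5000, 6000, "+"),
    Gene("B", "chr1", 3000, 4500, "-"),
    Gene("C", "chr1", 20000, 21000, "+"),
    Gene("D", "chr2", 19500, 19800, "-"),
]
print([g.name for g in genes_with_unshared_promoter(genes)])  # ['C', 'D']
```

The pairwise comparison is quadratic in the number of genes, which is fine for a sketch; for a whole genome, sorting by coordinate or using an interval tree would be the usual approach.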