Issue with Strand-Specific Motif Extraction


Alba Cortés Coego

Oct 24, 2023, 9:31:52 AM
to MEME Suite Q&A

Hello everyone,

I'm facing a challenge with MEME: I need to extract motifs in a strand-specific manner because I am interested in the motif surrounding a position that comes from an RNA. I have a BED file with the genomic coordinates and the strand, and I use the option "Sites must be on the given strand", but it retrieves the sequence of the positive strand anyway.

I also tried MEME's "bed2fasta" tool so that I could load FASTA sequences instead of coordinates, but it behaves the same way, extracting sequences from the positive strand despite the (-) sign in the header, as follows:

>chr17:23705117-23705137(+) -
AGTGCCCTGGGCAAGTCTCA
>chr7:117670323-117670343(+) +
ATGGCACCTCTCCATGCCAT

If anyone has encountered a similar issue or has experience with strand-specific motif analysis using MEME, I would greatly appreciate any guidance or solutions you can offer.

Thank you in advance. Best regards,

Alba

cegrant

Oct 27, 2023, 3:15:05 PM
to MEME Suite Q&A
Hi Alba,

MEME assumes that every DNA sequence it is given is the sequence of the forward strand. MEME uses the sequence header only as the name of the sequence and doesn’t parse out the coordinates and strand information, even if they are present. By default, when analyzing DNA, MEME internally generates the reverse complement so that it can check both strands for instances of a motif. If you select “Search given strand only”, MEME simply skips the internal generation of the reverse complement and considers only the sequences as given. The option therefore does not change which strand the sequences come from; the FASTA sequences themselves must already be in the orientation you want analyzed.
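Since the header is not parsed, one way to get strand-aware input is to reverse complement the minus-strand records yourself before handing the FASTA to MEME. The following is a minimal Python sketch of that idea, not part of the MEME Suite. It assumes that the last whitespace-separated field of each header is the strand (`+` or `-`), as in the example headers in the question; `orient_records` is a hypothetical helper name, and the assumption should be checked against how your headers were actually produced.

```python
# Complement table for DNA; N and lowercase bases are handled too.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def orient_records(fasta_text):
    """Return (header, sequence) pairs with minus-strand records flipped.

    ASSUMPTION: the strand is the last whitespace-separated field of the
    header ('+' or '-'). Records without such a field are left unchanged.
    """
    records = []
    header, chunks = None, []
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))

    oriented = []
    for header, seq in records:
        fields = header.split()
        strand = fields[-1] if fields else "+"
        if strand == "-":
            seq = reverse_complement(seq)
        oriented.append((header, seq))
    return oriented


example = """>chr17:23705117-23705137(+) -
AGTGCCCTGGGCAAGTCTCA
>chr7:117670323-117670343(+) +
ATGGCACCTCTCCATGCCAT
"""

for header, seq in orient_records(example):
    print(f">{header}\n{seq}")
```

The printed output can be saved as a new FASTA file and submitted with the given-strand option selected, so that MEME sees each site in the orientation of the transcript.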

It’s not clear from your question whether you are interested in RNA binding motifs or DNA binding motifs. It’s important to bear in mind that DNA binding motifs are generally NOT associated with a particular strand. DNA binding proteins generally act in the DNA major groove, and MEME will report both the forward and reverse complement orientations for a motif. RNA, of course, is single stranded, so RNA binding motifs should only be reported in the “forward” direction. If the sequences analyzed are RNA, MEME should detect this automatically, use the RNA alphabet, and only consider motifs in the “forward” orientation. If you are looking for RNA binding motifs but are using DNA sequences as a proxy for the RNA, then you will have to choose the “Search given strand only” option.
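To make the difference between the two search modes concrete, here is a toy Python sketch. It is not MEME's algorithm: it uses exact matching of a fixed word instead of a probabilistic motif model, and `find_sites` is a hypothetical function written only for this illustration. With `given_strand_only=True` only the sequence as supplied is scanned; otherwise occurrences of the word's reverse complement are also reported as minus-strand sites.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_sites(seq, word, given_strand_only=True):
    """Return sorted (start, strand) pairs for exact occurrences of word.

    Toy illustration only: a '-' site is reported wherever the reverse
    complement of the word occurs in the supplied sequence.
    """
    targets = [(word, "+")]
    rc_word = reverse_complement(word)
    # A palindromic word reads the same on both strands; skip the duplicate.
    if not given_strand_only and rc_word != word:
        targets.append((rc_word, "-"))

    sites = []
    for pattern, strand in targets:
        start = seq.find(pattern)
        while start != -1:
            sites.append((start, strand))
            start = seq.find(pattern, start + 1)
    return sorted(sites)


seq = "ATGGCACCTCTCCATGCCAT"
print(find_sites(seq, "ATGG", given_strand_only=True))
print(find_sites(seq, "ATGG", given_strand_only=False))
```

For sequences used as a proxy for RNA, the minus-strand hits in the second call are the ones you would not want reported, which is why the given-strand setting matters in that case.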
