Can evolver generate many simulations with one rootseq?


sumei zhou

Feb 27, 2025, 7:09:26 AM
to PAML discussion group

Hi authors,
I'm using Evolver to simulate sequences in order to study how repeat sequences evolve. For example, a 300 bp sequence is made of 10 tandem repeats, each repeat unit containing 30 bp (10 aa), and the repeat units are highly similar to each other within a species. I want to simulate the outcome of repeat evolution under a neutral model, so that I can compare the expected repeat similarity within species with the observed data. I have noticed the caution against using a fixed root sequence, but I do want the root sequence to have the repeat structure, so I came up with two approaches, and I ran into trouble with both:
(1) My preferred idea is to set the sequence length to the repeat unit length (30 bp) and change <# replicates>, but I found that this generates a new root sequence for each replicate. Can I get many simulation results from one random root sequence? As I understand it, one starting sequence can lead to many possible outcomes, so it seems reasonable to obtain many different results from the same root sequence.

(2) If the first approach doesn't work, can I manually rearrange the nucleotide order of each simulation's root sequence after I obtain many independent simulation results (each with a different root sequence), so that the root is forced into the repeat structure? I think mutations at every site of the sequence are random and independent, so maybe this works? It amounts to permuting the sites, which confuses me. I'm not sure whether this trick is valid.

That's all I can think of for now. It would be great if you could suggest any other solutions! I hope to get some ideas from all of you, which would mean a lot to me. Thank you for your patience!

Sandra AC

Feb 28, 2025, 4:49:24 AM
to PAML discussion group
Hi there,

Thanks for your message. I am posting Ziheng's answer on his behalf; I hope this helps!

I think your method 1 should be fine: simulate the evolution of repeats using the same root sequence. In this case the number of replicates should be the number of repeats.
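One way to turn the simulated replicates into the "expected repeat similarity" from the question is to treat each replicate's sequence for a given species as one repeat unit and average the identity over all pairs of units. The sketch below is only an illustration: the function names are hypothetical, it works on plain strings and does not read evolver's output files, and it measures similarity as simple per-site identity, which is an assumption and not something stated in this thread.

```python
from itertools import combinations


def identity(a, b):
    """Fraction of aligned sites at which two equal-length sequences match."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def mean_pairwise_identity(units):
    """Average identity over all unordered pairs of repeat units."""
    pairs = list(combinations(units, 2))
    if not pairs:
        raise ValueError("need at least two repeat units")
    return sum(identity(a, b) for a, b in pairs) / len(pairs)


def split_repeats(seq, unit_len):
    """Cut an observed tandem-repeat sequence into consecutive units."""
    if unit_len <= 0 or len(seq) % unit_len != 0:
        raise ValueError("sequence length must be a multiple of unit_len")
    return [seq[i:i + unit_len] for i in range(0, len(seq), unit_len)]


if __name__ == "__main__":
    # Hypothetical data: one simulated unit per replicate for one species.
    simulated_units = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC"]
    print("simulated:", mean_pairwise_identity(simulated_units))

    # Hypothetical observed sequence made of three 10 bp units.
    observed = "ACGTACGTAC" "ACGTACGTAC" "ACGTTCGTAC"
    print("observed: ", mean_pairwise_identity(split_repeats(observed, 10)))
```

The same `mean_pairwise_identity` function is applied to both the simulated units and the observed units, so the two numbers are directly comparable.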

Apparently the program will try to read the root sequence from the file RootSeq.txt if it exists.

You can run some small tests to confirm that the option is working. For example, you can use a small tree with branch lengths near 0, in which case the sequences at the tips should be nearly identical to the root sequence. Then use larger branch lengths. It should be very easy to tell whether all replicates are generated using the same root sequence.
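The near-zero branch length test can be automated roughly as follows. This is a minimal sketch under stated assumptions: the tip sequences are assumed to have already been read into Python as a list of replicates (each a list of strings), the function names and the tolerance value are hypothetical, and nothing here parses evolver's actual output format. The idea is that if every replicate starts from the same root and the branches are almost zero, every tip in every replicate should be nearly identical to every other tip; if each replicate had its own random root, tips from different replicates would differ at most sites.

```python
def fraction_differing(a, b):
    """Fraction of aligned sites at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def replicates_share_root(replicate_tips, tolerance=0.2):
    """Return True if all tips in all replicates are close to one reference tip.

    replicate_tips: list of replicates, each a list of tip sequences that were
    simulated on a tree with branch lengths near 0 (assumed input layout).
    tolerance: largest allowed fraction of differing sites (arbitrary choice).
    """
    if not replicate_tips or not replicate_tips[0]:
        raise ValueError("need at least one replicate with one tip")
    reference = replicate_tips[0][0]
    for tips in replicate_tips:
        for seq in tips:
            if fraction_differing(reference, seq) > tolerance:
                return False
    return True


if __name__ == "__main__":
    # Hypothetical tips from two replicates that started from the same root.
    same_root = [["ACGTACGTAC", "ACGTACGTAC"], ["ACGTACGTAC", "ACGTACGTAT"]]
    # Hypothetical tips from two replicates with unrelated roots.
    different_roots = [["ACGTACGTAC"], ["TTTTGGGGCC"]]
    print(replicates_share_root(same_root))
    print(replicates_share_root(different_roots))
```

With longer branch lengths the tips drift away from the root, so this check is only meaningful for the near-zero case.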

ziheng

