Copy genomes


Matthias Steinrücken

Jun 29, 2021, 1:58:17 PM
to slim-discuss
Hi all,

Is there an easy way to do the following? I want to sample a haplotype from a population and then add a duplicate of it to the population, replacing another haplotype, so that the resulting population has two identical copies of the target haplotype.

Something like this:

target1 = sample (p1.genomes, 1);
target2 = sample (p1.genomes, 1);
replace (target1, target2)

I figure this could be done with a nonWF model, but I would prefer to stick with a WF model. It could probably also be done by writing the population to a file, manipulating that file, and then reading it back in, but I was wondering whether there is a better way.

I want to do this in order to introduce a beneficial allele at a given frequency (rather than as a single copy) while making sure that all copies have an identical genetic background.

If this question has already been asked or is explained in the manual, I would greatly appreciate a pointer.

Thanks,
    Matthias

Ben Haller

Jun 29, 2021, 2:47:13 PM
to slim-discuss
Hi Matthias!  OK, so there are a variety of answers to this depending on what you're trying to do.  One question is what the actual biology is that you're trying to model – *why*, biologically, do all copies of the beneficial allele have the same genetic background?  Another question is how you plan to analyze the data afterwards; if you plan to use tree-sequence recording, then you might need the actual ancestry of that shared genetic background to be shared as well, whereas if you don't, then you don't need to worry about that.  Here are some ideas:

- If this situation arises as the result of a hard selective sweep with hitchhiking, you might want to explicitly model that process.  Section 9.3 of the manual shows an approach.

- If this situation arises through some kind of horizontal gene transfer, then you might want to explicitly model *that* process.  Section 16.14 has an example of modeling HGT, in the context of haploid bacteria.

- Both of those approaches give you a sensible tree sequence, in which the shared region has a root in a common ancestor.  If that is not important to you, then you can simply copy mutations from one individual to another, as your code snippet suggests:

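// target1 is the genome to copy from; target2 is the genome to overwrite
target1 = sample(p1.genomes, 1);
target2 = sample(p1.genomes, 1);
if (target2 != target1)
{
   // remove all of target2's mutations, then give it target1's mutations
   target2.removeMutations();
   target2.addMutations(target1.mutations);
}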

That will work in either a nonWF or a WF model, but it will produce an ancestry in tree-sequence recording that makes no sense, since the biological process through which this genetic pattern arose was not modeled.

You say "haplotype"; I'm not sure whether you want to replace *all* the mutations in target2, or just those within some subset of the genome.  If the latter, the code above is easy enough to modify; just remove/copy only the mutations within a given base position range.
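
A minimal sketch of that region-restricted variant might look like the following. The names regionStart and regionEnd and their values are hypothetical placeholders for illustration, not something from the question, and the sketch has not been tested against a particular SLiM version:

```
// Hypothetical region bounds (base positions, inclusive); choose your own
regionStart = 10000;
regionEnd = 20000;

target1 = sample(p1.genomes, 1);
target2 = sample(p1.genomes, 1);
if (target2 != target1)
{
   // mutations in target2 that lie inside the region (to be removed)
   oldMuts = target2.mutations;
   oldMuts = oldMuts[(oldMuts.position >= regionStart) & (oldMuts.position <= regionEnd)];

   // mutations in target1 that lie inside the region (to be copied)
   newMuts = target1.mutations;
   newMuts = newMuts[(newMuts.position >= regionStart) & (newMuts.position <= regionEnd)];

   target2.removeMutations(oldMuts);
   target2.addMutations(newMuts);
}
```

Mutations in target2 outside the region are left untouched, so only the chosen segment ends up identical between the two genomes.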

I hope this helps; happy modeling!

Cheers,
-B.

Benjamin C. Haller
Messer Lab
Cornell University


Matthias Steinrücken

Jun 29, 2021, 2:54:17 PM
to slim-discuss
Hi Benjamin,

Thanks a bunch for the quick reply.

I think your suggested code snippet will do exactly what I want. I'll give it a try.

The scenario is a hard sweep, but I don't want there to be any recombination or mutation before the beneficial allele reaches a certain frequency. I guess phrasing it this way suggests other ways of implementing it...

Anyways, thanks again for the help.

Best,
    Matthias