Hi Everyone,
Thank you to the moderators for having me. I have some questions that I am looking for help with.
I am trying to run a simulation with SLiM to create a .fasta file containing simulated sequences of HIV from patients. My aim is to generate random sequences and store them in a .fasta file. I am also trying to include replicates in my simulated set of sequences to make the simulation closer to real data. I have written some code to be able to do that. But I am facing issues with minor things in code that I am finding difficult to address. Please help me out here. I will need to run the .fasta file it is generating in MEGAX to create a phylogenetic tree. I also want to store the .fasta file in my C: drive. Can you help with specifying that file path as well?
Here is my code.
initialize() {
// Define simulation parameters
defineConstant("genomeLength", 9000); // Simplified genome length
defineConstant("popSize", 107); // Initial population size
defineConstant("mutationRate", 1e-4); // High mutation rate
defineConstant("recombinationRate", 1e-5); // Recombination rate
defineConstant("selectionCoefficient", 0.1); // Selection coefficient for beneficial mutations
defineConstant("bases", c("A", "T", "C", "G")); // Possible bases
defineConstant("numReplicates", 3); // Number of replicates per individual
// Initialize mutation rate, types, and genomic elements
initializeMutationRate(mutationRate);
initializeMutationType("m1", 0.5, "f", 0.0); // Neutral mutations
initializeMutationType("m2", 0.5, "f", selectionCoefficient); // Beneficial mutations
initializeGenomicElementType("g1", c("m1", "m2"), c(9, 1)); // 90% neutral, 10% beneficial
initializeGenomicElement("g1", 0, genomeLength - 1);
initializeRecombinationRate(recombinationRate);
}
1 early() {
sim.addSubpop("p1", popSize);
}
10000 late() {
string filePath = "simulation_output.fasta";
integer sequenceLength = 9000; // Adjusted to full genome length for realistic simulation
string allIndividualData = "";
for (individual in p1.individuals) {
for (integer replicate = 1; replicate <= numReplicates; replicate++) {
string sequence = generateInitialSequence(sequenceLength); // Generate a random initial sequence
// Mutate sequence
sequence = mutateSequence(sequence, individual.genomes);
string header = ">Individual_" + (individual.index + 1) + "_Replicate_" + replicate + "\n";
string individualData = header + sequence + "\n";
allIndividualData = concat(allIndividualData, individualData);
}
}
writeFile(filePath, allIndividualData);
sim.simulationFinished();
}
// Generate a random initial sequence of the specified length
function string generateInitialSequence(integer length) {
string sequence = "";
for (integer i = 0; i < length; i++) {
sequence = concat(sequence, sample(bases, 1, TRUE));
}
return sequence;
}
// Apply mutations to a sequence based on the individual's genomes
function string mutateSequence(string sequence, object genomes) {
for (integer i = 0; i < length(sequence); i++) {
if (runif(1) < mutationRate) {
sequence[i] = sample(bases, 1, TRUE); // Mutate to a new random base
}
}
return sequence;
}
Any inputs are highly appreciated here. Thank you!
Sincerely,
Anushka Jain